Abstract: The stability of the Aloha random access algorithm in an
infinite-user slotted channel with multipacket reception capability is
considered. This channel is a generalization of the usual collision channel,
in that it allows the correct reception of one or more packets involved in a
collision. The number of successfully received packets in each slot is
modeled as a random variable which depends exclusively on the number
of simultaneously attempted transmissions. This general model includes as
special cases channels with capture, noise, and code division multiplexing.
It is shown by means of drift analysis that the channel backlog Markov
chain is ergodic if the packet arrival rate is less than the expected number
of packets successfully received in a collision of n as n goes to infinity.
Finally, the properties of the backlog in the nonergodicity region are
examined.


I. Introduction

ONE of the main problems in random access communications
is the determination of the maximum stable throughput. In
particular, an important result is that the Aloha protocol is
unstable [1]-[3] in an infinite-user slotted collision channel
where a transmission is successful only if no other users attempt
transmissions simultaneously. Several strategies have been
designed to stabilize this channel, such as collision resolution
algorithms (see [4], for example) where transmissions are
deferred until the current conflict is resolved, and more recently,
Aloha-type strategies using decentralized control, where the
retransmission probability is updated according to previous
channel outcomes. It has been shown [5]-[7] that the maximum
stable throughput achievable by such Aloha-type strategies with
decentralized control is $e^{-1}$.

However, the collision channel model does not hold in many
important practical multiuser communication systems [8]-[21]
because simultaneous transmission of several packets does not
necessarily result in the destruction of all the transmitted
information. For instance, the capture phenomenon is common in
local area radio networks [12]-[15]: if the power of one of the
received packets is sufficiently large compared to the power of the
other packets involved in a collision, then the strongest packet can
be correctly decoded, while the other packets are lost. Other
examples are multiple-access channels where several users
transmit simultaneously in the same frequency band, and a
multiuser detector demodulates the information transmitted by all
active users (e.g., [8]-[11]). Although those systems do not
necessarily require a random access protocol, it is sometimes
useful to exercise some flow control through such a protocol so as
to limit the maximum number of simultaneous transmitters, in
order to bound the multiuser receiver complexity and guarantee
lower bit-error rates.

Manuscript received July 23, 1987; revised January 8, 1988. Paper
recommended by Past Associate Editor A. Ephremides. This work was
supported in part by the Office of Naval Research under Contract
N00014-87-K-0054 and by the Army Research Office under Contract
DAAL03-87-K-0062.

The authors are with the Department of Electrical Engineering, Princeton
University, Princeton, NJ 08544.

IEEE Log Number 8821359.


Previous studies of some of the aforementioned systems [9],
[12]-[18], where some of the packets involved in a collision may
be correctly received, have shown that performance is
noticeably improved with respect to slotted Aloha. However, even
in those special cases, no precise stability result is available, either
because finite population networks with no buffer space were
considered, or because the Poisson approximation of channel
traffic was used for infinite population networks. In [19] (see also
[20]), upper and lower bounds are derived for the capacity of a
multiple access channel where all packets are correctly received if
the collision size does not exceed a fixed threshold and otherwise
all packets are destroyed.

In this paper, we consider a generalization of the collision
channel, where the receiver can demodulate several packets
simultaneously. It is assumed that the number of correctly
demodulated packets is a random variable, which, given the
number of packets simultaneously transmitted, is independent of
the backlog and of the number of previous retransmission
attempts. This random variable can take any integer value
between zero and the collision size. Thus, the channel is described
by a matrix of conditional probabilities $(\epsilon_{nk})$ where $\epsilon_{nk}$ is the
probability that k packets are correctly demodulated given that
there were n simultaneous transmissions. We analyze the usual
Aloha algorithm with the multipacket reception capability just
described. Users are synchronized so that transmissions take place
within one slot, and at the end of each slot, stations that did
transmit a packet learn whether or not their transmission was
successful. Unsuccessful or backlogged packets are retransmitted
in each subsequent slot with probability p, 0 < p < 1. It turns out
that multipacket reception capability can stabilize Aloha. Our
main result states that the maximum stable throughput is equal to
the limit of the average number of packets correctly received in
collisions of size n when n goes to infinity. To show this, we
model the channel backlog as a Markov chain, and then study its
properties by using some simple drift analysis techniques.

The last part of this paper is a study of the properties of the
backlog in the nonergodicity region. Unlike the backlog Markov
chain for slotted Aloha, which is always transient [1], the backlog
for our model does in general have a null recurrence region of
positive length, which depends on the matrix $(\epsilon_{nk})$ and on the
retransmission probability p. However, transience in the
nonergodicity region can be ensured for a large class of systems, and in
particular for channels where the number of successful
simultaneous transmissions is bounded.

II.  Multipacket  Reception  Model 

Let $A_k$ be the number of new packets arriving during time slot
k. Assume that $(A_k)_{k \ge 0}$ are i.i.d. random variables with
probability distribution:

\[ P[A_k = n] = \lambda_n \quad (n \ge 0) \]

such that the mean arrival rate $\lambda = \sum_{n=1}^{\infty} n \lambda_n$ is finite. New
packets are transmitted with probability one at the beginning of
the first slot following their arrival.

Given that n packets are being transmitted in one slot, we define
for $n \ge 1$, $0 \le k \le n$

\[ \epsilon_{nk} = P[k \text{ packets are correctly received} \mid n \text{ are transmitted}]. \]

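The multipacket reception properties of the channel are
summarized by the stochastic matrix

\[
E = \begin{pmatrix}
\epsilon_{10} & \epsilon_{11} & 0 & 0 & \cdots \\
\epsilon_{20} & \epsilon_{21} & \epsilon_{22} & 0 & \cdots \\
\vdots & \vdots & \vdots & & \\
\epsilon_{n0} & \epsilon_{n1} & \epsilon_{n2} & \cdots & \\
\vdots & \vdots & \vdots & &
\end{pmatrix}
\]

which we refer to as the reception matrix of the channel. For
instance, the reception matrix for the usual collision channel is

\[
\begin{pmatrix}
0 & 1 & 0 & \cdots \\
1 & 0 & 0 & \cdots \\
1 & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots &
\end{pmatrix}
\]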

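while for a system with capture it has the form

\[
\begin{pmatrix}
0 & 1 & 0 & \cdots \\
1 - x_2 & x_2 & 0 & \cdots \\
\vdots & \vdots & \vdots & \\
1 - x_n & x_n & 0 & \cdots \\
\vdots & \vdots & \vdots &
\end{pmatrix}
\]

where $x_n$ is the probability of capture given that the collision size
is n. The model studied in [19], [20] can be described by a
reception matrix of the form

\[
\begin{pmatrix}
0 & 1 & 0 & 0 & \cdots \\
0 & 0 & 1 & 0 & \cdots \\
\vdots & & \ddots & & \\
0 & \cdots & 0 & 1 & \cdots \\
1 & 0 & 0 & 0 & \cdots \\
1 & 0 & 0 & 0 & \cdots \\
\vdots & & & &
\end{pmatrix}
\]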

Note that by letting $\epsilon_{10} \ne 0$ our model allows not only collisions
but also background noise to be a source of errors.

Denote by $X_n$ the number of backlogged packets in the system
at the beginning of slot n. The discrete-time process $(X_n)_{n \ge 0}$ is
easily seen to be a homogeneous Markov chain. We define the
system to be stable if $(X_n)_{n \ge 0}$ is ergodic and unstable otherwise.
The average number of packets correctly received in collisions of
size n is denoted by $C_n = \sum_{k=1}^{n} k \epsilon_{nk}$. We can now state the main
result.

Theorem 1: If $C_n$ has a limit $C = \lim_{n \to \infty} C_n$, then¹ the system
is stable for all arrival distributions such that $\lambda < C$ and is
unstable for $\lambda > C$. This also holds if C is infinite: if $\lim_{n \to \infty} C_n
= +\infty$, then the system is always stable.

The proof is given in Section III. In the remainder of this
section, we use Theorem 1 to analyze several simple random
access channels that fall within the scope of the multipacket
reception channel.
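As a minimal sketch of how Theorem 1 is used, the snippet below computes $C_n = \sum_k k \epsilon_{nk}$ for one row of a reception matrix and compares an arrival rate with a supplied limit C. The function names (`collision_row`, `capture_row`, `mean_received`, `classify`) are illustrative assumptions and do not come from the paper; the capture row assumes a constant capture probability x for every collision size, which is only one possible choice of $x_n$.

```python
from fractions import Fraction


def collision_row(n):
    """Row n of the collision-channel reception matrix: [eps_n0, ..., eps_nn]."""
    row = [Fraction(0)] * (n + 1)
    if n == 1:
        row[1] = Fraction(1)   # a lone packet is always received
    else:
        row[0] = Fraction(1)   # two or more packets destroy each other
    return row


def capture_row(n, x):
    """Row n of a capture channel, assuming the same capture probability x for all n >= 2."""
    row = [Fraction(0)] * (n + 1)
    if n == 1:
        row[1] = Fraction(1)
    else:
        row[0], row[1] = 1 - x, x
    return row


def mean_received(row):
    """C_n = sum over k of k * eps_nk for one row of the reception matrix."""
    return sum(k * e for k, e in enumerate(row))


def classify(lam, c_limit):
    """Apply Theorem 1: stable below the limit C, unstable above it."""
    if lam < c_limit:
        return "stable"
    if lam > c_limit:
        return "unstable"
    return "not covered by Theorem 1"
```

For the collision channel every row with n >= 2 has mean zero, so the limit C is 0 and no positive arrival rate is classified as stable, which agrees with the instability of slotted Aloha recalled in the Introduction.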

1) Mobile Users with Pairwise Transmissions: Consider an
infinite number of transmitters $T_1, T_2, \dots$, and an infinite
number of receivers $R_1, R_2, \dots$, whose positions in the plane are
i.i.d. random variables. Suppose that transmissions are pairwise

¹This result holds under the assumption that the Markov chain of the
number of backlogged packets is irreducible and aperiodic (for details and
sufficient conditions, see Section III).


Fig. 1. Pairwise transmissions with only one success (3-3). The figure
marks transmitters and receivers with distinct symbols.


in the sense that transmitter $T_n$ sends packets only to receiver $R_n$,
and $R_n$ is only interested in the packets sent by $T_n$ (see Fig. 1).
Assume also that each receiver can only detect correctly the
packet sent by the closest transmitter (in particular, this is the case
if there is perfect capture, see Example 3 below). The successes of
transmissions occurring at the same time are independent, so that
for $n \ge 2$

\[ \epsilon_{nk} = \binom{n}{k} p(n)^k (1 - p(n))^{n-k} \]

where p(n) is the probability that any given transmitter is
successful in a collision of size n, which is equal to 1/n if we
assume that all locations are memoryless, i.e., independent from
slot to slot. It follows that

\[ C_n = n p(n) = 1 \]

and the maximum throughput is 1. More generally, if because of
channel noise, the message of the closest transmitter is received
correctly with probability a (in other words $\epsilon_{11} = a$), then the
throughput is equal to a. The assumption that the locations of the
stations are memoryless is equivalent to assuming that they move
infinitely fast. If this simplifying assumption is dropped, then the
number of successes depends not only on the current number of
retransmissions, but also on the previous history of
retransmissions, and thus, the problem is no longer encompassed by our
multipacket reception model. In Fig. 2, the result of a simulation
shows that for moderate speeds, the actual throughput is well
approximated by the foregoing analysis.
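The binomial reception row of this example can be checked numerically. The sketch below builds row n with per-transmitter success probability a/n (a = 1 is the noiseless case) and confirms that the row sums to one and has mean a. The names `pairwise_row` and `mean_received` are illustrative assumptions, not notation from the paper.

```python
from math import comb, isclose


def pairwise_row(n, a=1.0):
    """Row n of the reception matrix: number of successes ~ Binomial(n, a/n)."""
    p = a / n
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]


def mean_received(row):
    """C_n = sum over k of k * eps_nk."""
    return sum(k * e for k, e in enumerate(row))


if __name__ == "__main__":
    for n in (2, 5, 20):
        row = pairwise_row(n)
        print(n, round(sum(row), 12), round(mean_received(row), 12))
```

Because the mean of a Binomial(n, a/n) variable is a for every n, the limit C equals a without any asymptotic argument.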

2) Frequency Hopping Random Access Channel: Consider a
finite population of N users transmitting by frequency hopping, as
in [11], [22]. For each packet he wants to transmit, a user selects
with equal probability one frequency in a fixed set of q
frequencies. A packet is correctly received iff no other packet is
transmitted on the same frequency during the same slot. We
compute $(\epsilon_{Nk})_{1 \le k \le N}$ and $C = \lim_{N \to \infty} C_N$. If the users have
infinite buffer space, then C can be taken as a good approximation
for large N of the maximum stable throughput of the system,
which is unknown. If the users have no buffer space, as is often
assumed, the backlog Markov chain is always ergodic, but even
then, one should expect reasonable delays in large population
problems only for arrival rates below C. The computation of the
reception matrix of this channel is a simple combinatorial problem
of random assignment of objects to cells (e.g., see [23, App. A]).
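For small N and q the random assignment problem can be solved by brute force, which gives an independent check on any closed-form row. The sketch below enumerates all $q^N$ equally likely frequency choices and counts the packets that are alone on their frequency. The function name `hopping_row` is an illustrative assumption. The comparison value $N(1 - 1/q)^{N-1}$ follows from linearity of expectation, since each user is alone on its frequency with probability $(1 - 1/q)^{N-1}$.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def hopping_row(N, q):
    """Exact row [eps_N0, ..., eps_NN], by enumerating all q**N frequency choices."""
    counts = Counter()
    for choice in product(range(q), repeat=N):
        load = Counter(choice)                        # packets per frequency
        successes = sum(1 for f in choice if load[f] == 1)
        counts[successes] += 1
    total = q ** N
    return [Fraction(counts[k], total) for k in range(N + 1)]


def mean_received(row):
    """C_N = sum over k of k * eps_Nk."""
    return sum(k * e for k, e in enumerate(row))


if __name__ == "__main__":
    N, q = 4, 3
    row = hopping_row(N, q)
    print(row, mean_received(row), N * Fraction(q - 1, q) ** (N - 1))
```

The enumeration is exponential in N, so it is only meant as a sanity check for small cases.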
Denote by $T_1, T_2, \dots, T_N$ the users, all involved in the collision,
and also denote by S the set of users whose packets are correctly
received. Two cases need to be considered.

a) $2 \le N \le q$: We have, for $1 \le j \le N$

\[ \epsilon_{Nj} = \binom{N}{j} P[S = \{T_1, T_2, \dots, T_j\}] \tag{1} \]

and the following decomposition:

\[ P[\{T_1, \dots, T_j\} \subseteq S] = P[S = \{T_1, \dots, T_j\}]
   + P\Big[ \bigcup_{k=j+1}^{N} \big\{ \{T_1, \dots, T_j, T_k\} \subseteq S \big\} \Big] \]

easily yields the desired expression

\[ P[S = \{T_1, \dots, T_j\}] = \sum_{k=0}^{N-j} (-1)^k \binom{N-j}{k}
   P[\{T_1, \dots, T_{j+k}\} \subseteq S] \tag{2} \]

where only one term is left to compute

\[ P[\{T_1, \dots, T_{j+k}\} \subseteq S] =
   \frac{q(q-1) \cdots (q-j-k+1)\,(q-j-k)^{N-j-k}}{q^N} \tag{3} \]

for $1 \le j \le N$, $0 \le k \le N - j$. Putting (1), (2), and (3) together
gives the result

\[ \epsilon_{Nj} = \binom{N}{j} \sum_{k=0}^{N-j} (-1)^k \binom{N-j}{k}
   \frac{q(q-1) \cdots (q-j-k+1)\,(q-j-k)^{N-j-k}}{q^N} \tag{4} \]

for $1 \le j \le N$. Notice in particular that $\epsilon_{N,N-1} = 0$. Let us now
compute the average number of packets correctly received in
collisions of size N, $C_N = \sum_{j=1}^{N} j \epsilon_{Nj}$. By using (4) and summing at
j + k constant, we get

\[ q^N C_N = \sum_{i=1}^{N} q(q-1) \cdots (q-i+1)\,(q-i)^{N-i}
   \sum_{n=0}^{i-1} (-1)^n \frac{N!}{n!\,(i-n-1)!\,(N-i)!} \]

which can be simplified as

\[ q^N C_N = \sum_{i=1}^{N} \frac{N!}{(i-1)!\,(N-i)!}\,
   q(q-1) \cdots (q-i+1)\,(q-i)^{N-i}\,(1-1)^{i-1} \]

to get the final result (only the term i = 1 survives)

\[ C_N = N \Big( 1 - \frac{1}{q} \Big)^{N-1}. \]

b) $N > q$: In this case, there can be at best q - 1 successes
in a collision of size N. The same method applies to get the
following probabilities:

\[ \epsilon_{Nj} = \binom{N}{j} \sum_{k=0}^{N-j} (-1)^k \binom{N-j}{k}
   \frac{q(q-1) \cdots (q-j-k+1)\,(q-j-k)^{N-j-k}}{q^N} \quad (1 \le j \le q-1) \]

\[ \epsilon_{Nj} = 0 \quad (q \le j \le N) \]

resulting in the same expected number of successes as before.

Now we let the population size N go to infinity and we apply
our result. If we let N grow to infinity while keeping q constant,
we have $\lim_{N \to \infty} C_N = 0$, so the system is always unstable. On the
other hand, if we let N go to infinity while keeping q equal to a
fixed percentage of the population size, i.e., N/q constant, then
$\lim_{N \to \infty} C_N = +\infty$, and the system is always stable. It is easily
shown that to get a finite maximum stable throughput, q has to
grow as $N / \ln N$.

Fig. 2. Throughput as a function of velocity, for mobile users with pairwise
transmissions. Stations moving in a square region; velocity units:
percentage of square side traveled in one slot. Retransmission probability set to

3) Mobile Radio Network with Capture: Consider an infinite
number of users independently and uniformly distributed in a
circle of radius R, whose positions are independent from slot to
slot. Users transmit packets to a common receiver located at the
center of the network. Denote by $P_1$ and $P_2$ the received powers
of the strongest and the next to strongest packets involved in a
collision. Assume, as in [12]-[14], that the strongest packet is
correctly received iff $P_1 / P_2 > K$ (K being a system dependent
constant), and that all the other packets involved in the collision
are not received successfully. Assume, moreover, that the
received power of a packet only depends on the distance r between
the sender and the receiver

\[ P = \frac{\text{constant}}{r^{\alpha}} \quad (\alpha \ge 2). \]

Then there will be capture iff

\[ r_2 > \beta r_1 \]

where $\beta = K^{1/\alpha}$ is the capture parameter, and $r_1$, $r_2$ are the
distances of the closest and the next to closest senders from the
receiver.
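The capture rule $r_2 > \beta r_1$ lends itself to a quick Monte Carlo check. The sketch below draws N users uniformly in a disk, sorts their distances to the center, and estimates the probability that the closest user is captured. The function name `capture_probability`, the trial count, and the fixed seed are illustrative assumptions. The estimate can be compared with the value $1/\beta^2$ that the text obtains for this example.

```python
import random


def capture_probability(N, beta, trials=20000, R=1.0, seed=0):
    """Estimate P[r_2 > beta * r_1] for N users uniform in a disk of radius R."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # The distance of a uniform point in a disk is distributed as R * sqrt(U).
        d = sorted(R * rng.random() ** 0.5 for _ in range(N))
        if d[1] > beta * d[0]:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for N in (2, 5, 10):
        print(N, capture_probability(N, beta=1.5), 1 / 1.5**2)
```

The estimate should not depend noticeably on N, which is why the limit of $C_N$ in this example is reached for every collision size of two or more.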
Denote by D the distance between a given user and the
receiver. It is easily shown that the pdf of D is given by

\[ p_D(r) = \frac{2r}{R^2} \quad (0 \le r \le R). \]

Given N users, denote by $U_N$ the closest to the receiver, and by
$D_N$ its distance from the receiver. Computing the cdf of $D_N$ and
taking its derivative, we obtain

\[ p_{D_N}(r) = \frac{2Nr}{R^2} \Big( 1 - \frac{r^2}{R^2} \Big)^{N-1} \quad (0 \le r \le R). \tag{5} \]

Given $D_N = r$, the other N - 1 users are uniformly distributed in
the annular region (r, R). So if N users collide and $D_N = r$, $U_N$
will be correctly received iff all the other users are in the annular
region ($\beta r$, R), which is empty if $\beta r > R$. Therefore, if we denote
by

\[ P_N(r) = P[\text{capture} \mid N \text{ collide}, D_N = r] \quad (N \ge 2) \]

we have

\[ P_N(r) = \begin{cases}
   \Big( \dfrac{R^2 - \beta^2 r^2}{R^2 - r^2} \Big)^{N-1} & \text{if } r \le R/\beta \\
   0 & \text{if } r > R/\beta.
   \end{cases} \tag{6} \]

Thus, the probability of capture in a collision of N ($N \ge 2$) is

\[ \epsilon_{N1} = \int_0^R P_N(r)\, p_{D_N}(r)\, dr. \]

Using (5) and (6), and with the change of variable x = r/R, this
is easily computed

\[ \epsilon_{N1} = \int_0^{1/\beta} 2Nx\,(1 - \beta^2 x^2)^{N-1}\, dx = \frac{1}{\beta^2}. \]

It follows that $C = 1/\beta^2$ is the maximum stable throughput.
Notice, in particular, that for $\beta = 1$ (perfect capture), we have C
= 1 and for $\beta \to \infty$ (no capture), we have $C \to 0$.

Under certain conditions, the performance of Aloha in the
multipacket channel can be improved by varying the
retransmission probability as a function of the channel history, and a
maximum stable throughput of $\sup_{x \ge 0} e^{-x} \sum_{n=1}^{\infty} C_n x^n / n!$ can be
reached (see [31]).

III. Ergodicity Region

The number of backlogged packets in the system at time n,
$(X_n)_{n \ge 0}$, is a homogeneous Markov chain whose one-step
transition probability matrix can be computed as a function of p,
$(\lambda_k)_{k \ge 0}$, and E. Denoting by $B_i(j)$ the probability of having j
retransmissions out of i backlogged packets

\[ B_i(j) = \binom{i}{j} p^j (1-p)^{i-j} \tag{7} \]

the transition probabilities are

\[ P_{00} = \lambda_0 + \sum_{n=1}^{\infty} \lambda_n \epsilon_{nn} \]

\[ P_{0k} = \sum_{n=k}^{\infty} \lambda_n \epsilon_{n,n-k} \quad (k \ge 1) \]

and for $i \ge 1$

\[ P_{i,i-k} = \sum_{n=0}^{\infty} \lambda_n \sum_{j=k}^{i} B_i(j)\, \epsilon_{n+j,n+k} \quad (1 \le k \le i) \]

\[ P_{ii} = \lambda_0 \Big[ B_i(0) + \sum_{j=1}^{i} B_i(j)\, \epsilon_{j0} \Big]
   + \sum_{n=1}^{\infty} \lambda_n \sum_{j=0}^{i} B_i(j)\, \epsilon_{n+j,n} \]

\[ P_{i,i+k} = \sum_{n=0}^{\infty} \lambda_{n+k} \sum_{j=0}^{i} B_i(j)\, \epsilon_{j+k+n,n} \quad (k \ge 1). \tag{8} \]

Sufficient conditions for $(X_n)_{n \ge 0}$ to be irreducible and
aperiodic are as follows:

• if 0 < p < 1:

\[ \lambda_0 \ne 0 \tag{9a} \]

\[ \lambda_0 + \sum_{n=1}^{\infty} \lambda_n \epsilon_{nn} < 1 \tag{9b} \]

\[ \epsilon_{10} \ne 1 \tag{9c} \]

• if p = 1: (9a), (9b), and

\[ \text{for all } i \ge 1, \ \epsilon_{i0} \ne 1. \tag{9d} \]

These are only sufficient conditions, but they hold for almost all
nontrivial systems. For example, if (9b) does not hold, then zero
is an absorbing state, since the left-hand side of (9b) is equal to
$P_{00}$. Also, (9c) simply means that the successful reception of a
single packet in the absence of other active users is possible.
Assume, for instance, that 0 < p < 1 and that the arrivals are
Poisson distributed. Then we only have to assume (9c), and (9b) is
true unless there is perfect reception, that is $\epsilon_{nn} = 1$ for all $n \ge 1$,
in which case the system would of course always be stable. The
case p = 1 gives rise to a number of pathological situations,
hence the much stronger condition (9d). It generally turns out
that either (9d) is not necessary or the stability region of the
system is obvious. For instance, it is clear from the transition
probabilities that slotted Aloha with p = 1 is always unstable. In
any case, it is assumed in what follows that $(X_n)_{n \ge 0}$ is irreducible
and aperiodic.

Proof of Theorem 1: The proof is based on drift analysis.
Recall that in general, the drift at state i ($i \ge 0$) is defined by

\[ d_i = E[X_{t+1} - X_t \mid X_t = i]. \]

If we denote by $\Sigma_t$ the number of successful transmissions in slot
t, we have

\[ X_{t+1} - X_t = A_t - \Sigma_t \]

and therefore

\[ d_i = \lambda - E[\Sigma_t \mid X_t = i]. \tag{10} \]

Now if $R_t$ is the number of retransmissions in slot t, we get

\[ P[\Sigma_t = k \mid X_t = i, A_t = n, R_t = j] = \epsilon_{n+j,k} \]

for $0 \le j \le i$, $0 \le k \le n + j$ and with the convention that $\epsilon_{00} = 1$,
$C_0 = 0$. Thus,

\[ E[\Sigma_t \mid X_t = i, A_t = n, R_t = j] = C_{n+j} \]

and

\[ E[\Sigma_t \mid X_t = i] = \sum_{n=0}^{\infty} \lambda_n \sum_{j=0}^{i} B_i(j)\, C_{n+j}. \tag{11} \]

The value of the drifts for our model follows from (10) and (11)

\[ d_i = \lambda - \sum_{n=0}^{\infty} \lambda_n \sum_{j=0}^{i} B_i(j)\, C_{n+j}. \tag{12} \]
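The drift formula can be cross-checked against the one-step transition law for a small case. The sketch below builds one row of the transition matrix directly from the model (next backlog = current backlog + arrivals - successes) and compares the resulting drift with $\lambda - \sum_n \lambda_n \sum_j B_i(j) C_{n+j}$. The helper names, the finite-support arrival distribution, and the capture channel with constant capture probability are illustrative assumptions.

```python
from math import comb


def binom_pmf(i, j, p):
    """B_i(j): probability of j retransmissions out of i backlogged packets."""
    return comb(i, j) * p**j * (1 - p)**(i - j)


def capture_row(n, x):
    """Row n of a capture channel; row 0 encodes the convention eps_00 = 1."""
    if n == 0:
        return [1.0]
    if n == 1:
        return [0.0, 1.0]
    return [1.0 - x, x] + [0.0] * (n - 1)


def transition_row(i, lam, p, row):
    """Map next_state -> probability, starting from backlog i."""
    out = {}
    for n, ln in enumerate(lam):                 # n new arrivals
        for j in range(i + 1):                   # j retransmissions
            for m, e in enumerate(row(n + j)):   # m successes out of n + j
                k = i + n - m
                out[k] = out.get(k, 0.0) + ln * binom_pmf(i, j, p) * e
    return out


def drift_formula(i, lam, p, row):
    """lambda minus the expected number of successes from backlog i."""
    mean_arrivals = sum(n * ln for n, ln in enumerate(lam))
    c = lambda n: sum(k * e for k, e in enumerate(row(n)))
    served = sum(ln * binom_pmf(i, j, p) * c(n + j)
                 for n, ln in enumerate(lam) for j in range(i + 1))
    return mean_arrivals - served


if __name__ == "__main__":
    lam = [0.5, 0.3, 0.2]                        # P[A = 0], P[A = 1], P[A = 2]
    row_fn = lambda n: capture_row(n, 0.4)
    r = transition_row(6, lam, 0.3, row_fn)
    print(sum(r.values()), sum((k - 6) * v for k, v in r.items()),
          drift_formula(6, lam, 0.3, row_fn))
```

Enumerating (arrivals, retransmissions, successes) triples avoids coding the separate cases of the transition probabilities by hand, at the cost of a cubic loop that is only practical for small backlogs.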

The idea of the proof is to compute $\lim_{i \to \infty} d_i$, which will turn out
to be a very simple expression, and then apply the results of [3]
and [24] to determine the ergodicity region of $(X_n)_{n \ge 0}$. Let us first
recall the two results that will be used in the sequel.

Lemma A (Pakes [24]): Let $(X_n)_{n \ge 0}$ be an irreducible and
aperiodic Markov chain having as state space the nonnegative
integers, denote by $(P_{ij})$ its transition probability matrix, and by
$d_i$ its drift at state i. Then if for all i, $|d_i| < \infty$, and if $\limsup_{i \to \infty} d_i
< 0$, $(X_n)_{n \ge 0}$ is ergodic.

Lemma B (Kaplan [3]): Under the assumptions of Lemma A,
if for some integer N > 0 and some constants B > 0, $c \in [0, 1]$
the following two conditions hold, then $(X_n)_{n \ge 0}$ is not ergodic:

i) for all $i \ge N$, $d_i > 0$

ii) for all $i \ge N$, all $\theta \in [c, 1]$, $\theta^i - \sum_j P_{ij} \theta^j \ge -B(1 - \theta)$.

From (12), it can be seen that $|d_i|$ is finite since

\[ |d_i| \le \lambda + \sum_{n=0}^{\infty} \lambda_n \sum_{j=0}^{i} B_i(j)\, C_{n+j} \le 2\lambda + ip. \]

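Next, the drift limit is given by the following lemma.

Lemma 1: If $C_n$ has a limit C, finite or not, then
$\lim_{i \to \infty} \sum_{n=0}^{\infty} \lambda_n \sum_{j=0}^{i} B_i(j)\, C_{n+j} = C$.

Proof of Lemma 1: We consider two separate cases
depending on whether C is finite.

1) $C = +\infty$.

Fix A > 0 and pick $r \ge 0$ such that $\lambda_r \ne 0$. There exists an
integer M such that for all $n \ge M$, $C_n > A$. Fix such an M. Then
we have for $i \ge M$

\[ \sum_{n=0}^{\infty} \lambda_n \sum_{j=0}^{i} B_i(j)\, C_{n+j}
   \ge \lambda_r \sum_{j=0}^{i} B_i(j)\, C_{j+r}
   \ge \lambda_r A \sum_{j=M}^{i} B_i(j) \]

which terminates the proof, since for any fixed $M \ge 0$

\[ \lim_{i \to \infty} \sum_{j=M}^{i} B_i(j) = 1. \tag{13} \]

2) $C < +\infty$.

We have for i > M

\[ \Big| \sum_{n=0}^{\infty} \lambda_n \sum_{j=0}^{i} B_i(j)\, C_{n+j} - C \Big|
   \le \sum_{j=0}^{M} B_i(j) \sum_{n=0}^{\infty} \lambda_n |C_{n+j} - C|
   + \sum_{j=M+1}^{i} B_i(j) \sum_{n=0}^{\infty} \lambda_n |C_{n+j} - C|. \tag{14} \]

Fix $\varepsilon > 0$. There exists M such that for all n > M, $|C_n - C| <
\varepsilon / 2$. Fix such an M. Then

\[ \sum_{j=M+1}^{i} B_i(j) \sum_{n=0}^{\infty} \lambda_n |C_{n+j} - C| < \frac{\varepsilon}{2}. \]

Also, if L is an upper bound for $C_n$

\[ \sum_{j=0}^{M} B_i(j) \sum_{n=0}^{\infty} \lambda_n |C_{n+j} - C|
   \le 2L \sum_{j=0}^{M} B_i(j) < \frac{\varepsilon}{2} \]

for i big enough because (13) holds, which takes care of the first
term in (14) and ends the proof of Lemma 1.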
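Lemma 1 can be illustrated numerically. The sketch below evaluates $\sum_n \lambda_n \sum_j B_i(j) C_{n+j}$ for growing backlog i, using truncated Poisson arrivals and the sequence $C_n = (n+1)/(2n)$, whose limit is 1/2. The choice of sequence, the truncation level, and the function names are illustrative assumptions.

```python
from math import comb, exp, factorial


def poisson_pmf(lam, nmax):
    """Poisson probabilities for n = 0..nmax (the tail beyond nmax is ignored)."""
    return [exp(-lam) * lam**n / factorial(n) for n in range(nmax + 1)]


def c_example(n):
    """Illustrative sequence C_n = (n + 1) / (2n), with the convention C_0 = 0."""
    return 0.0 if n == 0 else (n + 1) / (2 * n)


def expected_successes(i, p, arrivals, c):
    """sum_n lambda_n sum_j B_i(j) C_{n+j} for backlog i."""
    total = 0.0
    for j in range(i + 1):
        b = comb(i, j) * p**j * (1 - p)**(i - j)
        total += b * sum(ln * c(n + j) for n, ln in enumerate(arrivals))
    return total


if __name__ == "__main__":
    arrivals = poisson_pmf(0.6, 40)
    for i in (5, 20, 100, 400):
        print(i, expected_successes(i, 0.3, arrivals, c_example))
```

As i grows, the binomial mass moves to large j, so the sum is dominated by values $C_{n+j}$ that are close to the limit; this is the role played by (13) in the proof.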

Putting together (12) and Lemmas A and 1, we get that 1) if
$\lim_{n \to \infty} C_n = +\infty$, then $\lim_{i \to \infty} d_i = -\infty$, and $(X_n)_{n \ge 0}$ is
ergodic; and 2) if $\lim_{n \to \infty} C_n = C < +\infty$, then $\lim_{i \to \infty} d_i = \lambda -
C$, and $(X_n)_{n \ge 0}$ is ergodic for $\lambda < C$. If $\lambda > C$, we can apply
Lemma B and conclude that $(X_n)_{n \ge 0}$ is not ergodic provided that
Kaplan's condition ii) holds. This is the purpose of Lemma 2,
which is the last step in the proof of Theorem 1.

Lemma 2: If for all $n \ge 1$, $C_n \le L$ for some $L \in (0, \infty)$, then
Kaplan's condition holds: there exists a constant B, an integer N,
and a real $c \in [0, 1]$ such that

\[ \theta^i - \sum_j P_{ij} \theta^j \ge -B(1 - \theta) \quad \text{all } i \ge N, \ \theta \in [c, 1]. \]

Proof of Lemma 2: According to [25], it is enough to show
that the downward part of the drift, defined as

\[ D(i) = -\sum_{k=1}^{i} k P_{i,i-k} \]

is bounded below. From the transition probabilities (8), we get

\[ D(i) = -\sum_{k=1}^{i} k \sum_{n=0}^{\infty} \lambda_n \sum_{j=k}^{i} B_i(j)\, \epsilon_{n+j,n+k} \]

which can also be put in the form

\[ D(i) = -\sum_{j=1}^{i} B_i(j) \sum_{n=0}^{\infty} \lambda_n \sum_{k=1}^{j} k\, \epsilon_{n+j,n+k} \]

from which it follows that

\[ D(i) \ge -\sum_{j=1}^{i} B_i(j) \sum_{n=0}^{\infty} \lambda_n C_{n+j} \ge -L. \]

□

Notice that in the proof of Theorem 1 (and this also holds for
Theorem 2 below), the exact expression (7) for $B_i(j)$ is never
used. The only requirements are that $(B_i(j))_{0 \le j \le i}$ is a probability
distribution, and that (13) holds. Therefore, our results are valid
for a larger class of retransmission policies than was first
assumed. For example, there could be K priority groups, each
with a different retransmission probability.

Although Theorem 1 is quite general, in many practical cases,
the reception matrix has a very simple structure and the stability
region can be obtained with virtually no computations. This
happens for instance in radio networks with capture, where all that is
needed is the limit of the second column of the matrix, or also in
the simple case where above a certain collision size N, the
transmission is too garbled for the receiver to be able to decode
anything correctly, so that $C_n = 0$ for n > N.

This last example is a particular case of a noteworthy feature of
Theorem 1, namely that the stability region does not depend on
any finite number of rows of the reception matrix. In fact, any
number of modifications of the matrix that leaves $\lim_{n \to \infty} C_n$
unchanged does not affect the stability region. Although this may
be surprising at first sight, it can be intuitively explained by the
fundamental instability of the collision channel: unless the
receiver is perfect (all $\epsilon_{nn}$ equal to 1), the backlog will eventually
exceed any prefixed value with probability one, thus it is the limit
of $C_n$ that determines the stability region.

The stability region is also unchanged if the first transmission of
packets is delayed. If new packets are backlogged, that is,
transmitted for the first time with probability p in each slot
following their arrival (this transmission rule appears in the
literature as controlled-access or delayed first transmission), the
drifts become $d_i = \lambda - \sum_{j=1}^{i} B_i(j)\, C_j$ for $i \ge 1$, and from
Lemmas 1 and 2 the ergodicity region remains the same.

If $C_n$ does not have a limit, Theorem 1 does not give the stable
throughput of the system. Even though, in almost all practical
cases, and indeed in all the examples of Section II, $C_n$ does have a
limit, it is conceptually interesting to examine the case when
$\liminf_{n \to \infty} C_n \ne \limsup_{n \to \infty} C_n$. It is worth pointing out that adding
constraints as strong as the following on the reception matrix still
does not imply that $C_n$ has a limit:

i) $(\epsilon_{n0})_{n \ge 1}$ is nondecreasing

ii) $(\epsilon_{nk})_{n \ge k}$ is nonincreasing for all $k \ge 1$

iii) … for $n \ge 2$, $1 \le k \le n - 1$

although the counterexamples we have been able to build are
somewhat contrived. Notice that conditions i) and ii) above imply
that each column has a limit $\alpha_k = \lim_{n \to \infty} \epsilon_{nk}$ ($k \ge 0$), which is
very likely to happen in practice. In any case, Theorem 2 below
still gives some information on the stability region, although the
exact result requires in general the complete knowledge of the
sequence $(C_n)_{n \ge 1}$. In fact, given any nonnegative numbers
$\alpha \le \gamma \le \beta$, one can construct a reception matrix with nth row
average $C_n$ such that:

i) $\liminf_{n \to \infty} C_n = \alpha$

ii) $\limsup_{n \to \infty} C_n = \beta$

and such that the maximum stable throughput is $\gamma$.

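Theorem 2: The system is stable for $\lambda < \liminf_{n \to \infty} C_n$ and
unstable for $\lambda > \limsup_{n \to \infty} C_n$.

Proof:

a) If $\lambda < \liminf_{n \to \infty} C_n$, then $(X_n)_{n \ge 0}$ is ergodic.
If $\liminf_{n \to \infty} C_n = +\infty$, then $\lim_{n \to \infty} C_n = +\infty$, and the
result has already been proved, so assume that $\liminf_{n \to \infty} C_n$ is
finite. From Lemma A, it is enough to prove that for all $\varepsilon > 0$,
there exists N such that

\[ d_i < \lambda - \liminf_{n \to \infty} C_n + \varepsilon \quad \text{all } i \ge N. \]

Recall from (12) that we have

\[ d_i = \lambda - \sum_{n=0}^{\infty} \lambda_n \sum_{j=0}^{i} B_i(j)\, C_{n+j} \tag{15} \]

so it is only needed to prove that for all $\varepsilon > 0$ there exists N such
that

\[ \sum_{n=0}^{\infty} \lambda_n \sum_{j=0}^{i} B_i(j)\, C_{n+j} > \liminf_{n \to \infty} C_n - \varepsilon \quad \text{all } i \ge N. \]

Now by definition there exists M such that for all $k \ge M$:

\[ C_k > \liminf_{n \to \infty} C_n - \varepsilon \]

and therefore for all $i \ge M$:

\[ \sum_{n=0}^{\infty} \lambda_n \sum_{j=0}^{i} B_i(j)\, C_{n+j}
   \ge \Big( \liminf_{n \to \infty} C_n - \varepsilon \Big) \sum_{j=M}^{i} B_i(j) \]

which completes the proof since (13) holds.

b) If $\lambda > \limsup_{n \to \infty} C_n$, then $(X_n)_{n \ge 0}$ is not ergodic.

Since $\lambda$ is finite, in this case $\limsup_{n \to \infty} C_n$ is necessarily finite.
Therefore, $(C_n)_{n \ge 1}$ is bounded and from Lemma 2, Kaplan's
condition holds. Thus, it is enough to show that for all $\varepsilon > 0$,
there exists N such that

\[ d_i > \lambda - \limsup_{n \to \infty} C_n - \varepsilon \quad \text{all } i \ge N. \]

From (15), we only need to show

\[ \sum_{n=0}^{\infty} \lambda_n \sum_{j=0}^{i} B_i(j)\, C_{n+j} < \limsup_{n \to \infty} C_n + \varepsilon \quad \text{all } i \ge N. \]

Since there exists M such that for all $k \ge M$

\[ C_k < \limsup_{n \to \infty} C_n + \varepsilon \]

then if L is an upper bound for $C_n$, we have for $i \ge M$

\[ \sum_{n=0}^{\infty} \lambda_n \sum_{j=0}^{i} B_i(j)\, C_{n+j}
   \le L \sum_{j=0}^{M-1} B_i(j) + \limsup_{n \to \infty} C_n + \varepsilon \]

from which the result follows, using (13). □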

IV. Behavior of the Backlog Markov Chain in the
Nonergodicity Region

In this section, we further investigate the properties of $(X_n)_{n \ge 0}$
in the case $\lambda > C$, assuming of course that $(C_n)_{n \ge 1}$ has a finite
limit. It has been proved in [1] that the backlog Markov chain for
the usual slotted Aloha algorithm is transient, but this result
cannot be generalized to our model when $\lambda > C$. We give below
an example showing that $(X_n)_{n \ge 0}$ can be null recurrent when the
mean arrival rate $\lambda$ belongs to an interval of positive length. The
boundary between the null recurrence and the transience regions
generally depends in a rather complicated manner on both the
reception matrix and the retransmission probability p. We give a
sufficient condition for $(X_n)_{n \ge 0}$ to be transient when $\lambda > C$, as
well as bounds on the recurrence region.

Consider the reception matrix defined by

\[ \epsilon_{nk} = \frac{1}{n^2} \quad (1 \le k \le n) \]

for $n \ge 1$. Then $C_n = \sum_{k=1}^{n} k / n^2 = (n + 1)/2n$, and C = 1/2.
Using Lemmas C and D below, we show in [26] that $X_n$ is
recurrent for $\lambda < R(p)$ and transient for $\lambda > R(p)$, where R(p)
is a function of the retransmission probability p and is given by

\[ R(p) = \frac{1}{p} + \frac{1-p}{p^2} \ln(1-p) \quad (0 < p < 1) \]

\[ R(1) = 1. \]

It is easily seen that R(p) is an increasing function of p for $p \in
\,]0, 1[$ with extrema $\lim_{p \to 0} R(p) = 1/2$ and $\lim_{p \to 1} R(p) = 1$.
Fig. 3 summarizes the behavior of $X_n$ for this example.
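The boundary R(p) of this example is simple to evaluate. The sketch below codes the formula above and labels an operating point $(\lambda, p)$ by region, using C = 1/2 as the ergodicity threshold. The function names and the region labels are illustrative assumptions; `log1p` is used only to keep the logarithm accurate for small p.

```python
from math import log1p


def recurrence_boundary(p):
    """R(p) = 1/p + (1 - p)/p**2 * ln(1 - p) for 0 < p < 1, with R(1) = 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p == 1.0:
        return 1.0
    return 1.0 / p + (1.0 - p) / p**2 * log1p(-p)


def region(lam, p, c_limit=0.5):
    """Classify (lam, p) for the example eps_nk = 1/n**2, where C = 1/2."""
    if lam < c_limit:
        return "ergodic"
    if c_limit < lam < recurrence_boundary(p):
        return "null recurrent"
    if lam > recurrence_boundary(p):
        return "transient"
    return "boundary case"


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, recurrence_boundary(p), region(0.6, p))
```

Points that fall exactly on $\lambda = C$ or $\lambda = R(p)$ are reported as boundary cases, since the statements in the text are strict inequalities.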

It is somewhat surprising to see that in this case, as well as in all
the other examples we have computed, the recurrence region
becomes larger as p increases. Intuitively, the recurrence of $X_n$
when $\lambda > C$ seems to be due to the fact that transitions from any
state i to 0 (or to some fixed integer $k_0$) are possible and that the
probability of such an event, $P_{i0}$ (or $P_{ik_0}$), goes to zero slowly
with i. It can be checked that these probabilities are increasing
functions of p when i is large enough.

Fig. 3. Transience and ergodicity regions as a function of the retransmission
probability when $\epsilon_{nk} = 1/n^2$.

Transience is ensured for $\lambda > C$ if the supremum of the
elements of the $k$th column goes to zero faster than $k^{-2}$. This
condition holds for all the examples in Section II, as well as for
many real life cases, due to the practical limitations on the
receiver capabilities. In particular, it is always verified if the
reception matrix has only a finite number of nonzero columns (or
equivalently, if the backlog Markov chain has uniformly bounded
downwards transitions, as defined in [3]), which happens, for
instance, if there is capture. Note that the proof of Theorem 3
below is of course valid for the conventional collision channel,
and in this case becomes somewhat simpler than the proof in [1].

Theorem 3: If $\lim_{k\to\infty} k^2 \sup_{n\ge k} \epsilon_{nk} = 0$, then $(X_n)_{n\ge 0}$ is
transient for $\lambda > C$.

Because of the complexity and lack of structure of the one-step
transition probabilities (8), few results on the recurrence and
transience of Markov chains can be applied to our model. Before
proving Theorem 3, let us introduce the following two criteria
from [27].

Lemma C: Let $(X_n)_{n\ge 0}$ be an irreducible and aperiodic
Markov chain, having as state space the set of nonnegative
integers, and with one-step transition probability matrix $P = (P_{ij})$.
$(X_n)_{n\ge 0}$ is recurrent if and only if there exists a sequence
$(y_n)_{n\ge 0}$ such that

1) $\lim_{i\to\infty} y_i = +\infty$

2) for some integer $N > 0$, $\sum_{j=0}^{\infty} y_j P_{ij} \le y_i$ for all $i \ge N$.

We will only use the sufficiency part, which has also been proved
in [24].

Lemma D: With the same assumptions as in Lemma C,
$(X_n)_{n\ge 0}$ is transient if and only if there exists a sequence $(y_n)_{n\ge 0}$
such that

1) $(y_n)_{n\ge 0}$ is bounded

2) for some integer $N > 0$, $\sum_{j=0}^{\infty} y_j P_{ij} \le y_i$ for all $i \ge N$

3) for some $k \ge N$, $y_k < y_0, \ldots, y_{N-1}$.

Sufficiency under the additional constraints $y_i > 0$ and $\lim_{i\to\infty} y_i = 0$
has also been proved in [28]. Also, the sufficiency parts of
both lemmas are an immediate consequence of [29, Theorems 5
and 6] together with the results in [30].
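
To make the use of Lemma D concrete, the following sketch checks conditions 2) and 3) on a toy chain. Everything in it is an illustrative assumption rather than the backlog chain of this paper: the chain is a random walk on the nonnegative integers with up-probability q, down-probability 1 - q, and a self-loop at 0, and the test sequence is $y_n = ((1-q)/q)^n$, which is bounded by 1 when q > 1/2.

```python
def walk_row(i, q):
    """Nonzero entries {j: P_ij} of row i for a toy random walk on 0, 1, 2, ...

    Hypothetical chain (not the backlog chain): up with probability q,
    down with probability 1 - q, and a self-loop at 0 instead of a down step.
    """
    if i == 0:
        return {0: 1.0 - q, 1: q}
    return {i - 1: 1.0 - q, i + 1: q}


def lemma_d_conditions(q, N=1, i_max=200, rel_tol=1e-9):
    """Check conditions 2) and 3) of Lemma D for y_n = ((1 - q)/q)**n, i = N..i_max."""
    r = (1.0 - q) / q

    def y(n):
        return r**n

    # Condition 2): sum_j y_j P_ij <= y_i for all i >= N (checked up to i_max).
    drift_ok = all(
        sum(prob * y(j) for j, prob in walk_row(i, q).items()) <= y(i) * (1.0 + rel_tol)
        for i in range(N, i_max + 1)
    )
    # Condition 3): some y_k with k >= N lies below y_0, ..., y_{N-1}.
    dips_below = any(
        y(k) < min(y(i) for i in range(N)) for k in range(N, i_max + 1)
    )
    return drift_ok, dips_below


if __name__ == "__main__":
    print(lemma_d_conditions(0.7))  # upward-biased walk
    print(lemma_d_conditions(0.3))  # downward-biased walk
```

For q > 1/2 the sequence satisfies condition 2) with equality, so the three conditions hold and the toy walk is transient, as expected. The proof of Theorem 3 follows the same pattern with the polynomially decaying sequence $y_n = 1/(n+1)^{\theta}$.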

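Proof of Theorem 3: We use Lemma D with $y_n = 1/(n+1)^{\theta}$,
$\theta \in\ ]0, 1[$. We have

$$\sum_{j=0}^{\infty} P_{ij} y_j \le y_i \iff \sum_{k=1}^{i} (y_{i-k} - y_i) P_{i,i-k} + \sum_{k=1}^{\infty} (y_{i+k} - y_i) P_{i,i+k} \le 0 \qquad (16)$$

and

$$(i+1)^{1+\theta} \sum_{k=1}^{i} (y_{i-k} - y_i) P_{i,i-k} + (i+1)^{1+\theta} \sum_{k=1}^{\infty} (y_{i+k} - y_i) P_{i,i+k} = D'(i) + U'(i) \qquad (17)$$

where we have defined

$$D'(i) = (i+1) \sum_{k=1}^{i} \left[ \left(\frac{i+1}{i+1-k}\right)^{\theta} - 1 \right] \sum_{n=0}^{\infty} \lambda_n \sum_{j=k}^{i} B_i(j)\, \epsilon_{n+j,n+k}$$

$$U'(i) = (i+1) \sum_{k=1}^{\infty} \left[ \left(\frac{i+1}{i+1+k}\right)^{\theta} - 1 \right] \sum_{n=0}^{\infty} \lambda_{k+n} \sum_{j=0}^{i} B_i(j)\, \epsilon_{j+k+n,n}. \qquad (18)$$

The drift of $X_n$ at state $i$ can be computed from the transition
probabilities (8)

$$d_i = -\sum_{k=1}^{i} k P_{i,i-k} + \sum_{k=1}^{\infty} k P_{i,i+k} = D(i) + U(i) \qquad (19)$$

where we have defined

$$D(i) = -\sum_{k=1}^{i} k \sum_{n=0}^{\infty} \lambda_n \sum_{j=k}^{i} B_i(j)\, \epsilon_{n+j,n+k}$$

$$U(i) = \sum_{k=1}^{\infty} k \sum_{n=0}^{\infty} \lambda_{n+k} \sum_{j=0}^{i} B_i(j)\, \epsilon_{j+k+n,n}. \qquad (20)$$

The idea of the proof is to show that

$$\lim_{i\to\infty} [D'(i) + U'(i)] = -\theta \lim_{i\to\infty} d_i \qquad (21)$$

and since it has been proved in Section III that $\lim_{i\to\infty} d_i = \lambda - C$,
we will be able to conclude that $(X_n)_{n\ge 0}$ is transient for $\lambda > C$.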

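1) $\lim_{i\to\infty} [D'(i) + \theta D(i)] = 0$.

From (18) and (20)

$$D'(i) + \theta D(i) = (i+1) \sum_{k=1}^{i} \left[ \left(\frac{i+1}{i+1-k}\right)^{\theta} - 1 - \frac{\theta k}{i+1} \right] \sum_{n=0}^{\infty} \lambda_n \sum_{j=k}^{i} B_i(j)\, \epsilon_{n+j,n+k}$$

which is more conveniently written as

$$D'(i) + \theta D(i) = (i+1) \sum_{n=0}^{\infty} \lambda_n \sum_{j=1}^{i} B_i(j) \sum_{k=1}^{j} \left[ \left(\frac{i+1}{i+1-k}\right)^{\theta} - 1 - \frac{\theta k}{i+1} \right] \epsilon_{n+j,n+k}.$$

This expression is nonnegative since

$$\left(\frac{i+1}{i+1-k}\right)^{\theta} - 1 - \frac{\theta k}{i+1} \ge 0 \qquad (1 \le k \le i).$$

Define $\gamma_k = \sup_{n\ge k} \epsilon_{nk}$. Then

$$0 \le D'(i) + \theta D(i) \le (i+1) \sum_{n=0}^{\infty} \lambda_n \sum_{j=1}^{i} B_i(j) \sum_{k=1}^{j} \gamma_{n+k} \left[ \left(\frac{i+1}{i+1-k}\right)^{\theta} - 1 - \frac{\theta k}{i+1} \right]$$

that is

$$D'(i) + \theta D(i) \le \chi_1(i) + \chi_2(i)$$

with, assuming for instance that $i$ is odd

$$\chi_1(i) = \sum_{n=0}^{\infty} \lambda_n \sum_{k=1}^{(i+1)/2} (i+1) \left[ \left(\frac{i+1}{i+1-k}\right)^{\theta} - 1 - \frac{\theta k}{i+1} \right] \gamma_{n+k}$$

$$\chi_2(i) = \sum_{n=0}^{\infty} \lambda_n \sum_{k=(i+3)/2}^{i} (i+1) \left[ \left(\frac{i+1}{i+1-k}\right)^{\theta} - 1 - \frac{\theta k}{i+1} \right] \gamma_{n+k}. \qquad (23)$$

We show that $\chi_1(i)$ and $\chi_2(i)$ go to zero independently. Fix $\varepsilon > 0$.
Define for $0 < x \le i$ the function

$$\rho_i(x) = \frac{i+1}{x^2} \left[ \left(\frac{i+1}{i+1-x}\right)^{\theta} - 1 - \frac{\theta x}{i+1} \right].$$

It is easily proved that for each $i \ge 1$, $\rho_i(x)$ is a positive
nondecreasing function of $x$. Also

$$\rho_i\left(\frac{i+1}{2}\right) = \frac{1}{i+1} \left[ 4(2^{\theta} - 1) - 2\theta \right] = \frac{A}{i+1}$$

where $A$ is a positive constant depending only on $\theta$. From (23)

$$\chi_1(i) = \sum_{n=0}^{\infty} \lambda_n \sum_{k=1}^{(i+1)/2} k^2 \rho_i(k) \gamma_{n+k} \le \frac{A}{i+1} \sum_{n=0}^{\infty} \lambda_n \sum_{k=1}^{n+(i+1)/2} k^2 \gamma_k.$$

If $\lim_{k\to\infty} k^2 \gamma_k = 0$, then $\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} k^2 \gamma_k = 0$. So we can
choose $i$ large enough so that for $n > (i+1)/2$, $\sum_{k=1}^{n} k^2 \gamma_k < n\varepsilon$.
Then

$$\chi_1(i) \le \frac{A\varepsilon}{i+1} \sum_{n=0}^{\infty} \lambda_n \left( n + \frac{i+1}{2} \right) = A\varepsilon \left( \frac{\lambda}{i+1} + \frac{1}{2} \right).$$

Now if we choose $i$ big enough so that for $k > (i+3)/2$, we have
$\gamma_k < \varepsilon/k^2$, then

$$\chi_2(i) \le \varepsilon \sum_{n=0}^{\infty} \lambda_n \sum_{k=(i+3)/2}^{i} (i+1) \left[ \left(\frac{i+1}{i+1-k}\right)^{\theta} - 1 - \frac{\theta k}{i+1} \right] \frac{1}{(n+k)^2}.$$

By bounding the sum in the last equation by integrals, it can be
seen that it is upper bounded by a linear function of $\varepsilon$.

2) $\lim_{i\to\infty} [U'(i) + \theta U(i)] = 0$.

From (18) and (20)

$$U'(i) + \theta U(i) = (i+1) \sum_{k=1}^{\infty} \left[ \left(\frac{i+1}{i+1+k}\right)^{\theta} - 1 + \frac{\theta k}{i+1} \right] \sum_{n=0}^{\infty} \lambda_{k+n} \sum_{j=0}^{i} B_i(j)\, \epsilon_{j+k+n,n}.$$

With a change of variable

$$U'(i) + \theta U(i) = (i+1) \sum_{j=0}^{i} B_i(j) \sum_{n=1}^{\infty} \lambda_n \sum_{k=1}^{n} \left[ \left(\frac{i+1}{i+1+k}\right)^{\theta} - 1 + \frac{\theta k}{i+1} \right] \epsilon_{n+j,n-k}.$$

By using the following inequalities:

$$0 \le (1+x)^{-\theta} - 1 + \theta x \le \frac{\theta(1+\theta)}{2} x^2 \qquad (x \ge 0,\ 0 < \theta < 1)$$

we get

$$0 \le U'(i) + \theta U(i) \le \frac{\theta(1+\theta)}{2} (i+1) \sum_{j=0}^{i} B_i(j) \sum_{n=1}^{N} \lambda_n \sum_{k=1}^{n} \frac{k^2}{(i+1)^2}\, \epsilon_{n+j,n-k} + \theta (i+1) \sum_{j=0}^{i} B_i(j) \sum_{n=N+1}^{\infty} \lambda_n \sum_{k=1}^{n} \frac{k}{i+1}\, \epsilon_{n+j,n-k}$$

$$\le \frac{1}{i+1} \sum_{n=1}^{N} n^2 \lambda_n + \sum_{n=N+1}^{\infty} n \lambda_n.$$

Fix $\varepsilon > 0$. Choose $N$ such that $\sum_{n=N+1}^{\infty} n\lambda_n < \varepsilon/2$, and then, $N$
being fixed, choose $i$ large enough so that $\frac{1}{i+1} \sum_{n=1}^{N} n^2 \lambda_n < \varepsilon/2$. □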
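
Both halves of the proof rest on elementary one-variable inequalities for the bracketed terms, written with $u = k/(i+1)$. The sketch below checks them numerically on a grid. It is only a sanity check under the assumption that the brackets are $(1-u)^{-\theta} - 1 - \theta u$ for downward transitions and $(1+x)^{-\theta} - 1 + \theta x$ for upward ones; the function names are hypothetical.

```python
def down_bracket(u, theta):
    """(1 - u)**(-theta) - 1 - theta*u, with u = k/(i+1) in [0, 1)."""
    return (1.0 - u) ** (-theta) - 1.0 - theta * u


def up_bracket(x, theta):
    """(1 + x)**(-theta) - 1 + theta*x, with x = k/(i+1) >= 0."""
    return (1.0 + x) ** (-theta) - 1.0 + theta * x


def constant_A(theta):
    """4*(2**theta - 1) - 2*theta, i.e. down_bracket(1/2, theta) scaled by 4."""
    return 4.0 * (2.0**theta - 1.0) - 2.0 * theta


if __name__ == "__main__":
    for theta in (0.1, 0.5, 0.9):
        worst_down = min(down_bracket(j / 100, theta) for j in range(100))
        worst_ratio = max(
            up_bracket(j / 50, theta) / (theta * (1 + theta) / 2 * (j / 50) ** 2)
            for j in range(1, 501)
        )
        # worst_down should be >= 0 and worst_ratio should be <= 1.
        print(theta, worst_down, worst_ratio, constant_A(theta))
```

The first inequality follows from convexity of $(1-u)^{-\theta}$ and the second from a second-order Taylor bound, so the grid check only confirms what the proof uses.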

It should be clear at this point that unlike the ergodicity region,
the recurrence region depends in general on the elements of the
reception matrix (instead of only the row averages) and on the
retransmission probability $p$. For this reason, an exact expression
for the recurrence region seems rather difficult to obtain;
nonetheless, the method (see [26]) that we used to study the
example in Fig. 3 can be generalized to obtain the following upper
and lower bounds on the recurrence region.

Theorem 4: $(X_n)_{n\ge 0}$ is recurrent for $\lambda < L$ and transient for
$\lambda > U$, with $L = \max\{l_1, \sup_{0<\theta<1} l_\theta, \sup_{0<\theta<1} l'_\theta\}$ and
$U = \min\{u_1, \inf_{0<\theta<1} u_\theta, \inf_{0<\theta<1} u'_\theta\}$ where


-Jim  (/*■„  i  X,  |  m 


.VjHm(/+.l)t-»’ 'S-K'S  B,(j) 

/ml  * 


4  [(i+n+l)i-(i+n^k+.l)'Ui,iJtk 

V*  I ' 


lim  (/+ 1)  [In  (/+ 1)]»-»  |)  X;,  j  5,0") 

;-o  j-i 


^  [[In  (/+«+  l)]9-[ln  (i+n-k  +  l)]»]en+M 

:-I 


■na 


*-!>"»  (/+ 1)  [In  (/+l)]z  £  X„  2  B,-(y) 

,-o  >/.t 

^  r _ i _ i  i 

;m\  [_ln  (i+n  +  2-k)  ln(/+n  +  2)J  n+lt 

!,=-jim(i+l),+»|;xfl^B,(y) 

,.0  im t  I 

j  _J _ i  I 

lU+n+l-k)'  (/+n+l)»J 

'i=^(/+l)nn(/+l)],+»2xn2/?/0) 

,■0  j«t 

‘V  j  1 _ 1  1 

~x  J_[ln (i+n+2—k)Y  [In (/+/r+2)]# J  £n*J‘k ' 

We are assuming that the limits above exist, which indeed
happens in most practical cases. The theorem is valid if any of
these limits is infinite. In particular, if $L = +\infty$, then $X_n$ is
always recurrent. Note that usually, it is not necessary to carry out
all the computations, because one of the three terms in the
definition of $L$ is equal to one of the terms in the definition of $U$.
In fact, in most cases, we have $\sup_{0<\theta<1} l_\theta = \inf_{0<\theta<1} u_\theta$ if
$0 < p < 1$, and $u_1 = l_1$ if $p = 1$. The proof of Theorem 4 can be found
in [26].
Abstract—In this paper, we analyze an optical, direct-detection DPSK
receiver whose error probability is quantum-limited as the transmitting
laser linewidth vanishes. The receiver design is based on a binary
equiprobable hypothesis test with doubly stochastic point process
observations, the conditional random rates of which depend on the
transmitting laser phase noise, which is modeled as a Brownian motion. The
receiver structure consists of a simple delay-and-sum optical
preprocessor followed by a photoelectric converter and an integrate-and-dump
circuit. Upper and lower bounds on the receiver bit error rate are derived
by developing bounds on the conditional rates of the point process, and it
is shown that the error probability bounds converge to the true value as
the transmitting laser linewidth decreases. Bounds on the power penalty
are computed for parameters corresponding to existing semiconductor
injection lasers, and are seen to be less than the limiting power penalty for
the balanced DPSK receiver.

I. Introduction

In differential phase shift keying (DPSK), information is
conveyed by the carrier phase in the current symbol interval
relative to that in the previous interval. While less efficient
than phase shift keying (PSK), DPSK is less sensitive to large
phase noise amplitudes by utilizing phase noise correlation in
adjacent symbol intervals. Demodulation in conventional
(radio frequency) DPSK systems can be achieved by multiplying
the total received scalar field by a delayed version of itself,
followed by integration over a symbol period [1]. However,
due to the lack of efficient optical multipliers and sharp filters,
this receiver structure is not yet feasible at optical frequencies.
An alternative solution is to heterodyne the received optical
signal to the microwave frequency range, and employ a
conventional demodulation scheme. This heterodyne DPSK
receiver has been analyzed previously [2]. While incurring a 3
dB loss inherent to the heterodyning operation, this receiver
avoids the need to count photons, which may introduce an
appreciable loss in some existing avalanche photodiodes [3].

In this paper, we assume an ideal photon-counting device
and concentrate on the design and analysis of a direct-detection
DPSK receiver. Performance is measured by the power
penalty, which is the ratio of the transmitted optical power
required to achieve a given bit error rate to that required by a
receiver whose power requirement is determined solely by the
statistical nature of the photodetection process. Thus, the
power penalty is a measure of demodulation efficiency, and a
receiver with 0 dB power penalty is described as quantum
limited. In [2], a balanced, direct-detection DPSK receiver
was found to have a power penalty of at least 3 dB. We analyze
in this paper a DPSK receiver whose power penalty is 0 dB for
a transmitting laser with no phase noise, and less than 3 dB at
$10^{-9}$ bit error rate (BER) for existing semiconductor injection
lasers.

The remainder of this paper is organized as follows. In
Section II, we formulate a DPSK receiver from a binary
hypothesis test with point process observations. The randomness
of the rates of the point process under each hypothesis is
due to the transmitting laser phase noise, which is modeled as
a Brownian motion. It is shown why the optimum receiver is
not feasible, and then attention is restricted to simple strategies
that use photon counting only. The proposed DPSK receiver
implements a suboptimum binary hypothesis test preceded by
a delay-and-sum optical preprocessor. In Section III, we
analyze the receiver performance as a function of system
parameters. The exact error probability expression depends on
the moment generating function (mgf) of a functional of the
laser phase noise sample path. Since this mgf appears to be
intractable, the receiver error probability is bounded by using
an alternative functional, whose mgf is computable.

II. Optical DPSK Receiver

In binary DPSK modulation, the transmitted scalar field is
amplitude-modulated by a bit stream derived from the information
sequence. Denoting the information sequence as $\{\cdots b_{-1}, b_0, \cdots\}$,
where $b_n \in \{-1, 1\}$ is the information bit in
the time interval $nT \le t < (n+1)T$, we compose a sequence
$\{\cdots a_{-1}, a_0, a_1 \cdots\}$, $a_n \in \{-1, 1\}$, from the relation
$a_{n-1} a_n \triangleq b_n$ to amplitude-modulate the transmitted scalar electric
field. Under the assumptions of spatial homogeneity and
distortionless transmission, the transmitted (and received)
scalar electric fields may be described as

$$s(t) = a_n A \cos(\nu t + \theta_t) = \sqrt{2}\, \mathrm{Re}\left[ \frac{a_n A}{\sqrt{2}}\, e^{j(\nu t + \theta_t)} \right], \qquad nT \le t < (n+1)T \qquad (1)$$

where the Brownian motion $\{\theta_t, t \in \mathbb{R}\}$ models the
transmitted laser phase noise and $\nu$ is the carrier frequency.
Note that in the absence of laser phase noise the transmitted
optical energy is $\xi \triangleq A^2 T/2$ photons per bit.
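
The differential encoding relation can be sketched in a few lines. This is a minimal illustration assuming the relation reads $a_{n-1} a_n = b_n$ with symbols in {-1, +1}; the function names and the initial reference symbol `a_init` are hypothetical.

```python
def differential_encode(bits, a_init=1):
    """Map information bits b_n in {-1, +1} to symbols a_n with a_n = b_n * a_{n-1}."""
    a_prev = a_init
    symbols = []
    for b in bits:
        if b not in (-1, 1):
            raise ValueError("bits must be -1 or +1")
        a_prev = b * a_prev  # since a_{n-1}**2 == 1, a_{n-1} * a_n == b_n
        symbols.append(a_prev)
    return symbols


def differential_decode(symbols, a_init=1):
    """Recover b_n = a_{n-1} * a_n from consecutive symbols."""
    a_prev = a_init
    bits = []
    for a in symbols:
        bits.append(a_prev * a)
        a_prev = a
    return bits


if __name__ == "__main__":
    bits = [1, -1, -1, 1, -1]
    symbols = differential_encode(bits)
    print(symbols)
    print(differential_decode(symbols) == bits)
```

Because only the product of adjacent symbols carries information, flipping the sign of every symbol (including the reference) leaves the decoded bits unchanged, which is why the receiver needs no absolute phase reference.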

The decoding from $\{a_n\}$ to $\{b_n\}$ is performed in the same
operation that compares the received signal in $[nT, (n+1)T)$
to the reference signal. This suggests the demodulation
scheme shown in Fig. 1. The optical signal described in (1) is
divided by a half-silvered mirror into two signals of equal
power, one of which is delayed by the symbol period $T$, and
then added to the other. The resultant optical signal is incident
on the photodetector in Fig. 1 and is given by

$$r(t) = \frac{1}{\sqrt{2}} \left\{ a_n A \cos(\nu t + \theta_t) + a_{n-1} A \cos([\nu t + \theta_t] - [\nu T + \Delta\theta_t]) \right\} = \sqrt{2}\, \mathrm{Re}\left[ E_t e^{j\nu t} \right] \qquad (2)$$
where $E_t \triangleq \frac{A}{2} e^{j\theta_t} \{ a_n + a_{n-1} e^{-j(\nu T + \Delta\theta_t)} \}$ is the complex
envelope of the electric field incident on the photodetector,
and $\Delta\theta_t$ is an increment of Brownian motion given by

$$\Delta\theta_t \triangleq \theta_t - \theta_{t-T}. \qquad (3)$$

The remaining receiver structure will be based on the
electron point process observations at the output of the
photodetector. This process may be modeled as a doubly
stochastic point process with count $N_t$, occurrence times
$\{W_1, W_2, \cdots, W_{N_t}\}$, and rate $\lambda_t = \alpha |E_t|^2 + d$, where $\alpha$ and $d$ are
parameters of the photodetector [4]. It should be pointed out
that it is equivalent to express the rate as $\lambda_t = \alpha r^2(t) + d$, and
to ignore the double frequency terms. Although the following
results may easily be extended for arbitrary values of $\alpha$ and $d$,
we assume in the following that the photoelectric conversion is
ideal, i.e., $\alpha = 1$ and $d = 0$. Under these assumptions the rate
of the electron departure process for $0 \le t < T$ is

$$\lambda_t = \frac{A^2}{2} [1 + b_0 \cos \Delta\theta_t], \qquad 0 \le t < T \qquad (4)$$

where we have assumed $\nu T = 0 \bmod 2\pi$. The hypothesis pair
for the interval $[0, T)$ may be described as

$N_t$ is a doubly stochastic counting process with intensity:

$$H_0 : \lambda_t = \lambda_t^{(0)} \triangleq \frac{\xi}{T} (1 - \cos \Delta\theta_t), \qquad 0 \le t < T$$

$$H_1 : \lambda_t = \lambda_t^{(1)} \triangleq \frac{\xi}{T} (1 + \cos \Delta\theta_t), \qquad 0 \le t < T \qquad (5)$$

where $\xi$ is the transmitted optical energy in photons per bit as
defined earlier. If we define $\hat\lambda_t^{(i)} \triangleq E[\lambda_t^{(i)} \mid \{N_\sigma;\ 0 \le \sigma < t\}]$,
then the test which minimizes the probability of error is [5]

$$\sum_{i=1}^{N_T} \log \frac{\hat\lambda_{W_i}^{(1)}}{\hat\lambda_{W_i}^{(0)}} \ \underset{H_0}{\overset{H_1}{\gtrless}}\ \int_0^T \left( \hat\lambda_t^{(1)} - \hat\lambda_t^{(0)} \right) dt. \qquad (6)$$

The formulation of this log-likelihood ratio has an interesting
two-part structure. First derive the minimum mean squared
error causal estimates of the intensities given the observation
under each hypothesis, and then solve a binary hypothesis
testing problem with observations from a nonhomogeneous
Poisson process whose conditional rates are these estimates.
This fact has commonly been referred to as the separation
theorem of detection, and motivates the use of suboptimal
estimates in hypothesis testing with doubly stochastic point
process observations [4], [5].

Unfortunately, the explicit structure of (6) is unknown due
to the difficulty in evaluating the conditional estimates $\hat\lambda_t^{(i)}$.
One approach to this problem is to replace the optimum
estimates with suboptimum estimates. In this paper, we
propose the suboptimum estimates $E[\lambda_t^{(i)}]$. A justification of
this approach is the following. Denoting $B_L$ as the laser
linewidth in Hertz, $\gamma \triangleq 1/B_L T$, and choosing $x_0$ such that
$1 - e^{-\pi/\gamma} < x_0 < e^{-\pi/\gamma}$, we have

$$P[\exists t \in [0, T] : |\cos \Delta\theta_t - E[\cos \Delta\theta_t]| > x_0] \le 2 e^{-(\gamma/32\pi) [\cos^{-1}(e^{-\pi/\gamma} - x_0)]^2}. \qquad (7)$$

This upper bound follows from Lemma 1 in Section III. For
$\gamma = 1000$ and $x_0$ = 20 percent of $E[\cos \Delta\theta_t]$, the left-hand side
of (7) is less than 0.03. For $\gamma = 1500$ and $x_0$ = 20 percent of
$E[\cos \Delta\theta_t]$, the probability is less than 0.003. So the
probability that a sample path of the intensity deviates from the
mean in a symbol interval by more than 20 percent is bounded
above by a small number for reasonable values of $\gamma$. Based on
this argument, we employ the (constant) mean $E[\cos \Delta\theta_t]$ to
estimate the process $\{\cos \Delta\theta_t,\ t \in [0, T]\}$, and the suboptimal
estimates of the conditional rates follow directly. The advantage
to assuming a constant estimate is that the test assumes
homogeneous Poisson point process observations, and the
decision strategy is very simple: compare the photon count to a
threshold. By (7) we can expect this strategy to be optimum for
large $\gamma$, although it may be far from optimum for low $\gamma$.

A suboptimal receiver design is specified below. Since $\{\theta_t,\ t \in \mathbb{R}\}$
is a Brownian motion with zero drift and diffusion
coefficient $\sqrt{2\pi B_L}$, $\{\Delta\theta_t,\ t \in \mathbb{R}\}$ is a stationary Gaussian
random process with zero mean and autocorrelation function

$$R_{\Delta\theta}(\tau) = \begin{cases} 2\pi B_L (T - |\tau|) & \text{if } |\tau| \le T \\ 0 & \text{otherwise.} \end{cases} \qquad (8)$$

The means of $\lambda_t^{(i)}$ are easily computed

$$E[\lambda_t^{(0)}] = \frac{\xi}{T} (1 - e^{-\pi/\gamma})$$

$$E[\lambda_t^{(1)}] = \frac{\xi}{T} (1 + e^{-\pi/\gamma}). \qquad (9)$$

Employing these suboptimal estimates in (6), a suboptimal
hypothesis test is

where $\lfloor \cdot \rfloor$ indicates the greatest integer function. Due to the
stationarity of the laser phase noise increment, the suboptimal
test is not a function of the photon arrival times, and the
remaining portion of the DPSK receiver need only count the
number of received photons in the interval $[0, T]$. The entire
suboptimal receiver structure is shown in Fig. 1. Note that the
threshold obtained in (10) is not the threshold that minimizes
the error probability of a test which uses the statistic $N_T$. The
optimum threshold is also a function of the optical power and
$\gamma$, and is closely approximated as a by-product of the error
probability analysis of Section III.
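
As a rough illustration of how a count threshold arises once the conditional rates are replaced by constant means, the sketch below assumes a log-likelihood ratio test between two homogeneous Poisson processes on $[0, T]$ with mean counts $\xi(1 \mp e^{-\pi/\gamma})$. Under that assumption the test compares $N_T$ with $(m_1 - m_0)/\ln(m_1/m_0)$, where $m_0 < m_1$ are the mean counts. The function names and this closed form are assumptions for illustration and are not a transcription of (10).

```python
import math


def mean_counts(xi, gamma):
    """Mean photon counts over [0, T] under H0 and H1: xi * (1 -/+ exp(-pi/gamma))."""
    c = math.exp(-math.pi / gamma)
    return xi * (1.0 - c), xi * (1.0 + c)


def count_threshold(xi, gamma):
    """Real-valued threshold of the assumed Poisson LLR test (decide H1 above it)."""
    m0, m1 = mean_counts(xi, gamma)
    return (m1 - m0) / math.log(m1 / m0)


if __name__ == "__main__":
    for gamma in (50.0, 1000.0, 1e6, 1e9):
        t = count_threshold(6.0, gamma)
        print(gamma, t, math.floor(t))
```

In this sketch the threshold shrinks as $\gamma$ grows, and its integer part eventually reaches zero, which is consistent with the large-$\gamma$ test that decides on the mere presence of a photon count.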

It is worthwhile to compare this DPSK receiver to the
balanced receiver analyzed in [2]. The balanced receiver
divides the received optical signal into four streams of equal
intensity, two of which are delayed by the symbol period $T$. A
delayed stream is added to a nondelayed stream in the same
way as in the proposed receiver, except at one-half the power.
The remaining pair of optical streams are subtracted. The
outputs of the summer and subtractor are each input to
photoelectric converters, and the resulting counts are
subtracted. The resulting hypothesis test requires no threshold
setting. That is, the difference of the electron counts is
compared to zero, and the receiver is independent of $\gamma$.
However, in using this information about the laser phase
noise, the proposed DPSK receiver requires one-half the
transmitted optical power of the balanced DPSK receiver to
achieve the same error rate. This claim will be verified in
Section III. This comparison assumes that the three-port beam
combiners used in both receivers are lossless. It should be
noted that if a Mach-Zehnder interferometer is used as the
beam combiner [6], then the proposed DPSK detector uses
only one of the two output port beams, and the energy of the
signal incident on the photodetector is one-half of that assumed
in the analysis of this paper. In this case, the performance of
our receiver is asymptotically equivalent to that of the
balanced detector in [2], implemented using both output ports
of a Mach-Zehnder interferometer [7].

As suggested above, the parameter $\gamma$ is central in the
analysis of optical communication systems employing coherent
light with nonzero linewidth. It characterizes the performance
degradation due to the transmitting laser phase jitter
relative to the symbol rate. For fixed laser linewidth, the effect
of the phase noise on system performance is less pronounced
as the symbol rate increases, as reflected by an increase of $\gamma$.
Typically, $\gamma \in [50, 1600]$, which follows from $B_L \in$ [6 MHz,
20 MHz] for semiconductor injection lasers [8], and symbol
rates from 1 to 10 Gbits/s.
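
The quoted range of $\gamma$ can be reproduced directly from its definition. The sketch below assumes $\gamma = 1/(B_L T)$ with $T$ equal to the reciprocal of the symbol rate; the function name is hypothetical.

```python
def linewidth_parameter(symbol_rate_hz, linewidth_hz):
    """gamma = 1 / (B_L * T), with T = 1 / symbol_rate, i.e. symbol rate over linewidth."""
    if symbol_rate_hz <= 0 or linewidth_hz <= 0:
        raise ValueError("rates must be positive")
    return symbol_rate_hz / linewidth_hz


if __name__ == "__main__":
    # Slowest symbol rate with the widest linewidth, and the opposite corner.
    print(linewidth_parameter(1e9, 20e6))
    print(linewidth_parameter(10e9, 6e6))
```

The two corners give 50 and roughly 1667, in line with the approximate interval quoted in the text.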


III. Performance Analysis

In this section, we characterize the performance of the
proposed DPSK receiver. We show that as $\gamma \to \infty$ the
probability of error is the quantum limit of optical communications.
We then derive upper and lower bounds on the
probability of error for arbitrary $\gamma$ and show that these bounds
converge to the true value as $\gamma \to \infty$. Finally, we present
Monte Carlo simulation results of the receiver performance
and compare them to the error probability bounds.

We begin by showing that the performance of the proposed
DPSK receiver is quantum limited as $\gamma \to \infty$. In this case, the
transmitting laser is ideal and it is easy to see that (10)
becomes

$$N_T \ \underset{H_0}{\overset{H_1}{\gtrless}}\ 0. \qquad (11)$$

As $\gamma \to \infty$, the rates under each hypothesis are deterministic
and $N_T$ is an unconditionally Poisson random variable under
each hypothesis. If we define $\Lambda_i \triangleq \int_0^T \lambda_t^{(i)}\, dt$, $i \in \{0, 1\}$, then
the probability of error is

$$P[\text{error}] = \tfrac{1}{2} P[N_T = 0 \mid H_1] = \tfrac{1}{2} e^{-\Lambda_1}. \qquad (12)$$

Since $\Lambda_1 = 2\xi$ for an ideal transmitting laser, (12) becomes

$$P[\text{error}] = \tfrac{1}{2} e^{-2\xi} \qquad (13)$$

which is known as the quantum limit. Thus, the receiver
performance is quantum limited as the transmitting laser
linewidth goes to zero.
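
The quantum limit is easy to evaluate and invert. The sketch below assumes the error probability $\frac{1}{2} e^{-2\xi}$, with $\xi$ the transmitted energy in photons per bit; the function names are hypothetical.

```python
import math


def quantum_limit_ber(photons_per_bit):
    """Error probability (1/2) * exp(-2 * xi) for an ideal laser and ideal photon counting."""
    return 0.5 * math.exp(-2.0 * photons_per_bit)


def photons_for_ber(target_ber):
    """Invert the quantum limit: xi = ln(1 / (2 * BER)) / 2."""
    if not 0.0 < target_ber < 0.5:
        raise ValueError("target_ber must lie in (0, 0.5)")
    return 0.5 * math.log(1.0 / (2.0 * target_ber))


if __name__ == "__main__":
    print(quantum_limit_ber(6.0))   # at the 6 photons/bit operating point
    print(photons_for_ber(1e-9))    # energy needed at the quantum limit for 1e-9 BER
```

Under this formula a bit error rate of $10^{-9}$ corresponds to about 10 photons per bit, which is the reference level against which a power penalty is measured.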

Next, we consider the error probability for finite values of
$\gamma$. It is convenient to define the moment generating function
$M_i(v) \triangleq E[e^{v\Lambda_i}]$ and to let $N(\Lambda)$ denote a Poisson random
variable with mean $\Lambda$. Conditioned on the rate and the
hypothesis, the observation process is a Poisson point process.
Therefore, the probability of error under $H_1$ may be found by
first conditioning on $\{\Delta\theta_t,\ 0 \le t \le T\}$

$$P[\text{error} \mid H_1] = E[P[N(\Lambda_1) \le l \mid \{\Delta\theta_t,\ 0 \le t \le T\}]] = \sum_{k=0}^{l} \frac{1}{k!} \frac{d^k}{dv^k} M_1(v) \Big|_{v=-1} \qquad (14)$$

where $l$ is an arbitrary nonnegative integer threshold, and the
last line follows from an application of the bounded convergence
theorem. By a similar argument we have under $H_0$

$$P[\text{error} \mid H_0] = 1 - \sum_{k=0}^{l} \frac{1}{k!} \frac{d^k}{dv^k} M_0(v) \Big|_{v=-1}. \qquad (15)$$

It appears that there is no closed-form expression for the
foregoing moment generating functions when $\{\Delta\theta_t,\ 0 \le t \le T\}$
is a $T$ increment of Brownian motion. In the remainder of
this section, we consider upper and lower bounds to (14) and
(15). Based on the fact that the mgf of $\frac{1}{2T} \int_0^T (\Delta\theta_t)^2\, dt$ is
computable (see the Appendix), our approach is to find upper
and lower bounds on the error probability based on quadratic
bounds of $\cos x$

$$1 - \frac{x^2}{2} \le \cos x \le f_a(x) \triangleq \begin{cases} 1 - \dfrac{a x^2}{2} & \text{if } |x| \le x_a \\ 1 & \text{otherwise} \end{cases} \qquad (16)$$

where $0 < a < 1$ and $x_a$ is the smallest positive real number
such that $1 - a x_a^2/2 = \cos x_a$. If each cosine in the
expressions for $\Lambda_i$ is replaced by the upper bounding function,
the corresponding rates are further apart, hence it is easier to
discriminate between them and a lower bound on the error
probability is obtained. Analogously, an upper bound is
obtained if each cosine is replaced by the lower bounding
function. More precisely, the cumulative distribution function
(cdf) of a Poisson random variable is a strictly decreasing
function of its mean, that is, $P[N(\Lambda) \le l]$ is decreasing in $\Lambda$.
If the error probability under each hypothesis is found by first
conditioning on $\{\Delta\theta_t,\ 0 \le t \le T\}$, then the conditional
probabilities are Poisson cdf's, and substituting the bounds
from (16) in the expressions for $\Lambda_i$ will yield upper and lower
bounds for the conditional probabilities as well as the
unconditioned probabilities. Denoting $\Lambda_i^U$ as the bound on $\Lambda_i$
which yields an upper bound to the error probability under $H_i$,
and $\Lambda_i^L$ as the bound on $\Lambda_i$ which yields a lower bound to the
error probability under $H_i$, we have

$$\Lambda_0^U = \frac{\xi}{2T} \int_0^T (\Delta\theta_t)^2\, dt$$

$$\Lambda_1^U = 2\xi - \Lambda_0^U$$

$$\Lambda_1^L = \xi + \frac{\xi}{T} \int_0^T f_a(\Delta\theta_t)\, dt$$

$$\Lambda_0^L = 2\xi - \Lambda_1^L. \qquad (17)$$

As seen by (14), (15), and (17) computation of these bounds
requires the moment generating functions of $\Lambda_0^U$ and $\Lambda_1^L$. The
following two lemmas loosen two of the bounds so that we
require only $\Phi(v) \triangleq E[e^{v\Lambda_0^U}]$, which is derived in the
Appendix. Lemma 1 quantifies the fact that the probability that
a $T$ length sample path of $\Delta\theta$ deviates far from the origin
decreases with increasing $\gamma$. Lemma 2 uses this fact in
deriving a lower bound on the error probability under $H_0$,
which does not depend on the mgf of $\Lambda_1^L$.

Lemma 1: Let $x \in \mathbb{R}^+$. Then

$$P[\exists t \in [0, T] : |\Delta\theta_t| > x] \le 4 Q\left( \frac{x}{4} \sqrt{\frac{\gamma}{\pi}} \right)$$

where $Q[x] \triangleq \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt$.

Proof: Let $\{W_t,\ t \in \mathbb{R}^+\}$ be a Wiener process, and
$k \triangleq x / (2\sqrt{2\pi B_L})$. For $v \in \mathbb{R}^+$, we define the stopping times
$T_v \triangleq \inf\{t : W_t > v\}$ and $T_{-v} \triangleq \inf\{t : W_t < -v\}$. Then,

$$P[\exists t \in [0, T] : |\Delta\theta_t| > x]$$

$$= P[\exists t \in [0, T] : |\theta_t - \theta_{t-T}| > x]$$

$$\le P\left[ \exists t \in [0, 2T] : |\theta_{-T+t} - \theta_{-T}| > \frac{x}{2} \right]$$

$$= P\left[ \exists t \in [0, 2T] : |\theta_t - \theta_0| > \frac{x}{2} \right]$$

$$= P\left[ \exists t \in [0, 2T] : |W_t| > \frac{x}{2\sqrt{2\pi B_L}} \right] \qquad (18)$$

$$= P[\min(T_k, T_{-k}) < 2T]$$

$$\le P[T_k < 2T] + P[T_{-k} < 2T] \qquad (19)$$

$$= 2 P[T_k < 2T] \qquad (20)$$

$$= 2 \{ P[T_k < 2T,\ W_{2T} \ge k] + P[T_k < 2T,\ W_{2T} < k] \}$$

$$= 4 P[T_k < 2T,\ W_{2T} \ge k] \qquad (21)$$

$$= 4 P[W_{2T} \ge k] = 4 Q\left( \frac{k}{\sqrt{2T}} \right).$$

Equation (18) follows from the Markov property of the
Brownian motion, (19) from the union bound, (20) from the
fact that $-W$ has the same probability law as $W$, and (21)
from the reflection principle of the Wiener process. Since
$\gamma = 1/B_L T$, the argument $k/\sqrt{2T}$ equals $(x/4)\sqrt{\gamma/\pi}$. ■

It should be pointed out that a tighter upper bound is
possible by using the first passage times for the process
$\{\Delta\theta_t,\ t \in [0, T]\}$, whose distributions are known [9], or by
strengthening the inequality in (19) using the first passage time
of Brownian motion out of a symmetric interval about the
origin [10], [11]. However, the easy upper bound used in the
proof suffices for our needs.
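
The bound of Lemma 1 is simple to evaluate. The sketch below assumes it takes the form $4Q((x/4)\sqrt{\gamma/\pi})$, which is what the last line of the proof, $4P[W_{2T} \ge k]$, gives with $k = x/(2\sqrt{2\pi B_L})$ and $\gamma = 1/(B_L T)$. The function names are hypothetical, and the result is clipped at 1 because it bounds a probability.

```python
import math


def q_function(z):
    """Gaussian tail Q(z) = P[standard normal > z], via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def excursion_bound(x, gamma):
    """Assumed Lemma 1 bound on P[|phase increment| exceeds x somewhere in [0, T]]."""
    if x <= 0 or gamma <= 0:
        raise ValueError("x and gamma must be positive")
    return min(1.0, 4.0 * q_function(0.25 * x * math.sqrt(gamma / math.pi)))


if __name__ == "__main__":
    for gamma in (50.0, 200.0, 1000.0, 1500.0):
        print(gamma, excursion_bound(0.65, gamma))
```

The bound decays quickly with $\gamma$, which is the quantitative content of the remark that large excursions of the phase increment become unlikely for narrow linewidths.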

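Lemma 2: Let $l$ be an arbitrary, nonnegative integer. Lower
bounds on the conditional error probabilities are

$$1 - 4Q\left( \frac{x_a}{4} \sqrt{\frac{\gamma}{\pi}} \right) - E[P[N(a\Lambda_0^U) \le l \mid \{\Delta\theta_t,\ 0 \le t \le T\}]] \le E[P[N(\Lambda_0) > l \mid \{\Delta\theta_t,\ 0 \le t \le T\}]]$$

$$\sum_{k=0}^{l} \frac{(2\xi)^k}{k!} e^{-2\xi} \le E[P[N(\Lambda_1) \le l \mid \{\Delta\theta_t,\ 0 \le t \le T\}]]$$

where $1 - a x^2/2 \ge \cos x\ \ \forall |x| \in [0, x_a]$.

Proof: Let $1_A$ be the indicator function of event $A$.
Then

$$E[P[N(\Lambda_0) > l \mid \{\Delta\theta_t,\ 0 \le t \le T\}]]$$

$$= 1 - E[P[N(\Lambda_0) \le l \mid \{\Delta\theta_t,\ 0 \le t \le T\}]]$$

$$= 1 - E[P[N(\Lambda_0) \le l \mid \{\Delta\theta_t,\ 0 \le t \le T\}] \cdot \{ 1_{\{|\Delta\theta_t| \le x_a,\ \forall t \in [0, T]\}} + 1_{\{\exists t \in [0, T] : |\Delta\theta_t| > x_a\}} \}]$$

$$\ge 1 - E[P[N(a\Lambda_0^U) \le l \mid \{\Delta\theta_t,\ 0 \le t \le T\}] \cdot 1_{\{|\Delta\theta_t| \le x_a,\ \forall t \in [0, T]\}}] - P[\exists t \in [0, T] : |\Delta\theta_t| > x_a]$$

$$\ge 1 - E[P[N(a\Lambda_0^U) \le l \mid \{\Delta\theta_t,\ 0 \le t \le T\}]] - P[\exists t \in [0, T] : |\Delta\theta_t| > x_a]$$

where $1 - a x^2/2 - \cos x \ge 0\ \ \forall |x| \in [0, x_a]$.
Applying Lemma 1, we have proved the first bound of this
claim. To prove the second bound, we recognize the fact that
$\Lambda_1 \le 2\xi$, and that the cdf of $N(\Lambda)$ is a monotonically
decreasing function of $\Lambda$. ■

As a direct result of Lemma 2 a lower bound on the total
probability of error is

$$1 - 4Q\left( \frac{x_a}{4} \sqrt{\frac{\gamma}{\pi}} \right) + \sum_{k=0}^{l} \left[ \frac{(2\xi)^k}{k!} e^{-2\xi} - \frac{a^k}{k!} \frac{d^k}{dv^k} \Phi(v) \Big|_{v=-a} \right] \le 2P[\text{error}] \qquad (22)$$

where $\Phi(v)$ is as defined earlier, and $l$ is an arbitrary
nonnegative integer. From (17) as well as the fact that the Poisson
cdf is a decreasing function of the mean, an upper bound on
the total error probability is

$$2P[\text{error}] \le 1 + \sum_{k=0}^{l} \frac{1}{k!} \frac{d^k}{dv^k} \left[ e^{2\xi v} \Phi(-v) - \Phi(v) \right]_{v=-1}. \qquad (23)$$

Both (22) and (23) depend on $\Phi(v)$, which is computed in the
Appendix.

In the next lemma, we show that the bounds in (22) and (23)
converge as $\gamma \to \infty$.

Lemma 3: Let $l$ be an arbitrary, nonnegative integer. Then

$$\lim_{\gamma\to\infty} \left\{ 1 + \sum_{k=0}^{l} \frac{1}{k!} \frac{d^k}{dv^k} \left[ e^{2\xi v} \Phi(-v) - \Phi(v) \right]_{v=-1} \right\} = \lim_{\gamma\to\infty} \left\{ 1 - 4Q\left( \frac{x_a}{4} \sqrt{\frac{\gamma}{\pi}} \right) + \sum_{k=0}^{l} \left[ \frac{(2\xi)^k}{k!} e^{-2\xi} - \frac{a^k}{k!} \frac{d^k}{dv^k} \Phi(v) \Big|_{v=-a} \right] \right\}.$$

Proof: We rewrite (22) as

$$P[N(2\xi) \le l] + 1 - 4Q\left( \frac{x_a}{4} \sqrt{\frac{\gamma}{\pi}} \right) - E[P[N(a\Lambda_0^U) \le l \mid \{\Delta\theta_t,\ 0 \le t \le T\}]] \le 2P[\text{error}].$$

Taking limits of both sides as $\gamma \to \infty$, we have

$$P[N(2\xi) \le l] + 1 - \lim_{\gamma\to\infty} E[P[N(a\Lambda_0^U) \le l \mid \{\Delta\theta_t,\ 0 \le t \le T\}]] \le 2P[\text{error}].$$

The last two terms cancel by an application of the bounded
convergence theorem, and the continuity of the Poisson cdf
with the mean. The same result follows from the limit of (23)
by similar arguments. ■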
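
The quadratic bounds on $\cos x$ behind these lemmas can be explored numerically. The sketch below assumes the upper bounding function is $1 - a x^2/2$ up to the first positive crossing $x_a$ with $\cos x$ and 1 beyond it, and finds $x_a$ by bisection on $(0, 2\pi]$. The function names and the bracketing interval are assumptions made for illustration.

```python
import math


def smallest_crossing(a, tol=1e-12):
    """Smallest x > 0 with 1 - a*x*x/2 == cos(x), for 0 < a < 1, found by bisection."""
    if not 0.0 < a < 1.0:
        raise ValueError("a must lie in (0, 1)")

    def gap(x):
        return 1.0 - a * x * x / 2.0 - math.cos(x)

    # gap is positive just right of 0 and negative at 2*pi, with a single sign change.
    lo, hi = 1e-3, 2.0 * math.pi
    if gap(lo) <= 0.0:
        raise ValueError("a is too close to 1 for this starting bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def f_a(x, a, x_a):
    """Upper bounding function: 1 - a*x^2/2 for |x| <= x_a, and 1 otherwise."""
    return 1.0 - a * x * x / 2.0 if abs(x) <= x_a else 1.0


if __name__ == "__main__":
    for a in (0.25, 0.5, 0.9):
        x_a = smallest_crossing(a)
        print(a, x_a, 1.0 - a * x_a * x_a / 2.0, math.cos(x_a))
```

Choosing $a$ closer to 1 tightens the quadratic part of the bound but shrinks $x_a$, so more of the probability mass falls in the region where the cosine is only bounded by unity.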

Since  the  upper  and  lower  bounds  result  by  replacing  the 
mgf  of  l/T\% cos  AO,  dt  by  that  of  l/7[^ (1  -  A0?/2)  dt,  it  is 
of  interest  to  explore  the  difference  between  the  mgf’s  of  the 
two  random  variables.  In  Fig.  2,  we  compare  the  mgf  of  1/7 
[ o  cos  Ad,  dt  via  Monte  Carlo  simulation  to  the  theoretical  mgf 
of  1/7  [£  (1  -  A8j/2)  dt  for.-y  =*  30,  which  is  conservative 
well  below  the  range  of  values  of  interest  in  the  analysis  of 
the  Droposed  detector.  The  theoretical  mgf  of  1/7  [0r  (1  - 
i*0f/2)  dt  begins  to  differ  from  the  mgf  of  1/7  j  J  cos  Ad,  dt 
obtained  bv  Monte  Carlo  simulation  at  u  =  -  6.  As  seen  in  the 
Appendix,  this  sharp  rise  of  the  mgf  is  due  to  a  branch  point  in 
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Fig.  2.  Moment  generating  functions  (y  ±=  30). 


Fig.  3.  Error  rate  bounds  and  simulation  results  (6  pbotoos/bii). 

the infinite product expression of the mgf. As $\gamma$ increases, this branch point occurs for smaller values of $v$. Fortunately, we are concerned with the region $v \in [-1, 1]$ where we evaluate the mgf for the error probability bounds.

Fig. 3 presents the upper and lower bounds on the error probability of the test in (10) for an optical power of 6 photons per bit. Also shown are results of Monte Carlo simulations of the hypothesis test. It appears from the simulation results that the upper bound is tighter than the lower bound. This is
because the lower bound on $E[P[N(\Lambda_1) \le l \mid \{\Delta\theta_t,\ 0 \le t \le T\}]]$ obtained in Lemma 2 is derived by trivially upper bounding $\cos(x)$ by unity for $|x| > x_a$. In Fig. 3, it can also be
seen that the error probability bounds are discontinuous functions of $\gamma$. These discontinuities occur for values of $(\xi, \gamma)$ where the suboptimal threshold, given by the RHS of (10), changes value. Indeed, these discontinuities result from the use of a suboptimal threshold. As the LHS of (10) is integer-valued, it is straightforward to optimize the threshold so as to minimize either the upper or lower bounds. Because of the tightness exhibited by the upper bound, we choose the threshold function which minimizes (23). Additionally, since the process $\{\lambda(t),\ t \in (0, T]\}$ is close to the mean for moderate $\gamma$, the suboptimum threshold function in (10) is equal to the optimized threshold function except in very small intervals in the range of interest of $\gamma$. In Fig. 4, we have displayed the lower envelope of the upper bounds corresponding to all integer thresholds (which is, obviously, an upper bound to the error probability of the test obtained with the optimum threshold) together with the lower bound computed at the threshold that minimizes the upper bound for several values of $\xi$. That is, Fig. 4 displays (22) and (23) replacing $l$ by the optimized threshold function.
Fig. 4. (a) Error probability bounds with optimized threshold. (b) Error probability bounds with optimized threshold.


Fig.  5.  Bounds  on  the  power  penalty  of  proposed  DPSK  receiver. 


The power penalty is an alternative way to characterize the receiver performance. Fig. 5 shows bounds on the power penalty of the proposed DPSK receiver at $10^{-9}$ BER. These curves were obtained by recording the values of $\gamma$, for fixed optical energy, at which the lower bound (22) and upper bound (23) were equal to $10^{-9}$. The optimized threshold was employed for these curves as well. It should be noted that the lower bounding curve in Fig. 5 is a smooth lower bounding envelope to the power penalty data. By comparison, the power penalty for the balanced DPSK receiver as described in [2] is always greater than 3 dB, and attains this value only as $\gamma \to \infty$. As Fig. 5 illustrates, the power penalty for the quantum-limited DPSK receiver is below 3 dB for $\gamma > 700$.

IV.  Summary 

In this paper, we have analyzed the error probability of an asymptotically quantum-limited direct-detection DPSK receiver. The receiver consists of a delay-and-sum optical preprocessor in tandem with a photoelectric converter and an integrate-and-dump circuit. The output is initially compared to a suboptimal threshold that was derived under the assumption that the conditional rates are constants. We tightly bounded the error probability for arbitrary thresholds by developing upper and lower bounds on the conditional intensities of the photon point process at the photodetector. Prompted by the tightness of the upper bound, we then improved the receiver performance by minimizing the upper bound over all integer threshold levels. This optimized threshold coincides with the RHS of (10) except in very small intervals in the range of $\gamma$. The power penalty bounds were computed using the optimized threshold function, and appear in Fig. 5. While the balanced DPSK receiver analyzed in [2] had a power penalty greater than 3 dB, the receiver presented here had a power penalty less than 3 dB for reasonable values of $\gamma$.

Appendix 

Moment Generating Function of $\frac{1}{2T}\int_0^T \Delta\theta_t^2 \, dt$

In this section, we find an expression for the moment generating function of the random variable $\frac{1}{2T}\int_0^T \Delta\theta_t^2 \, dt$. A more general problem has been solved previously [12]. Consider the random process $\{z_t \triangleq \int_0^t x_\tau^2 \, h(t, \tau) \, d\tau,\ t \in \mathbb{R}^+\}$ where $\{x_\sigma,\ \sigma \in \mathbb{R}^+\}$ is a zero mean, wide-sense stationary Gaussian random process with autocorrelation function $R(\tau)$. Then the moment generating function $M_{z_t}(v)$ is given in infinite product form as


$$M_{z_t}(v) = \prod_{i=1}^{\infty} \frac{1}{\sqrt{1 - 2 v \lambda_i}} \tag{A.1}$$


where $\{\lambda_i,\ i = 1, 2, \ldots\}$ are obtained by solving the homogeneous integral equation

$$\lambda_i \phi_i(\sigma) = \int_0^t R(\tau - \sigma) \, h(t, \tau) \, \phi_i(\tau) \, d\tau. \tag{A.2}$$
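The infinite product in (A.1) can be approximated numerically by truncating it to finitely many eigenvalues. The following is a minimal sketch under that assumption; the function name `mgf_from_eigenvalues` and the example spectrum `example_eigs` are hypothetical and are not the eigenvalues of the integral equation solved in this Appendix.

```python
import math


def mgf_from_eigenvalues(v, eigenvalues):
    """Truncated form of the product (A.1): prod_i 1 / sqrt(1 - 2*v*lambda_i).

    `eigenvalues` is any finite list of positive numbers standing in for the
    leading eigenvalues lambda_i. The product is finite only while every
    factor 1 - 2*v*lambda_i stays positive, i.e. for v < 1 / (2 * max lambda_i).
    """
    product = 1.0
    for lam in eigenvalues:
        factor = 1.0 - 2.0 * v * lam
        if factor <= 0.0:
            raise ValueError("mgf is not finite: 1 - 2*v*lambda must be positive")
        product /= math.sqrt(factor)
    return product


# Hypothetical spectrum decaying like 1/(2n+1)^2 (illustration only).
example_eigs = [0.05 / (2 * n + 1) ** 2 for n in range(200)]
for v in (-1.0, 0.0, 1.0):
    print(v, mgf_from_eigenvalues(v, example_eigs))
```

With a single eigenvalue the product reduces to the familiar mgf of a scaled squared standard Gaussian variable, which gives a quick sanity check. The smallest positive $v$ at which a factor vanishes, $1/(2\lambda_{\max})$, is where the square-root singularity of the product appears.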


For our particular case, we have $t = T$, $h(T, \tau) = \frac{1}{2T}$, and $R(\tau)$ as in (8). By substituting these equations into (A.2), we find for $0 < \sigma < T$


('  {T-\a-r\)4>M  dr.  (A.3) 

7  Jo 

Similar  equations  result  for  other  values  of  a.  Taking  the  first- 
and  second-partial  derivatives  of  (A.3)  with  respect  to  a  we  get 


X/d/(d)=-  f  sgn  {r-a)4>i{r)  dr  0 <a<T  (A.4) 

7  Jo 


and 


2x 

<pt(o)= — r  4>i(v)  0<o<T.  (A.5) 

7  N 

Equation (A.5) suggests the general solution

$$\phi_i(\sigma) = A_i \cos \omega_i \sigma + B_i \sin \omega_i \sigma, \qquad 0 < \sigma < T. \tag{A.6}$$

If we substitute (A.6) into (A.3) and (A.4) to solve for the unknowns $A_i$, $\omega_i$, and $B_i$, we find that $\{\lambda_i,\ i = 1, 2, \ldots\}$ are the solutions to


1 

2 


fex  1  .  fex  1  . 

[ - sin - =  1  +  cos 

7  X,  h 


Included among these eigenvalues are $\{2\pi/(\gamma [2n + 1]^2 \pi^2),\ n = 0, 1, 2, \ldots\}$. The remaining portion of the eigenvalues are close to, but not exactly, $\{2\pi/(\gamma [2n\pi]^2),\ n = 1, 2, \ldots\}$. Now that the eigenvalues are known, the moment generating function of $\frac{1}{2T}\int_0^T \Delta\theta_t^2 \, dt$ may be found by (A.1).
Multiple-Access Channels with Memory with and without Frame Synchronism

SERGIO VERDÚ, SENIOR MEMBER, IEEE


Abstract: The capacity region of frame-synchronous and asynchronous discrete two-user multiple-access channels with finite memory is obtained. Frame synchronism refers to the ability of the transmitters to send their codewords in unison. The absence of frame synchronism in memoryless multiple-access channels is known to result in the removal of the convex hull operation from the expression of the capacity region. We show that when the channel has memory, frame asynchronism rules out nonstationary inputs to achieve any point in the capacity region, thereby allowing only coding strategies that involve cooperation in the frequency domain but not in the time domain. This restriction drastically reduces the capacity region of some multiple-access channels with memory, and in particular the total capacity of the channel, which is invariant to frame asynchronism for memoryless channels.

I.  Introduction 

THE CENTRAL result in multiuser information theory states that the capacity region of a two-user discrete multiple-access channel is equal to the convex closure of the set of rate pairs $(R_1, R_2)$ satisfying

$$\begin{aligned}
0 \le R_1 &\le I(X; Z \mid Y), \\
0 \le R_2 &\le I(Y; Z \mid X), \\
R_1 + R_2 &\le I(X, Y; Z)
\end{aligned} \tag{1}$$

for some independent input distributions X and Y and output distribution Z. This result was obtained by Ahlswede [1] in 1971 under the key assumptions that the channel is memoryless and frame (or block) synchronous, i.e., the beginnings of the codewords sent by the transmitters coincide. In the absence of frame synchronism, an unpredictable offset exists between the epochs at which the codewords of each user are received at the decoder. Even though the receiver can easily acquire timing synchronism with each user and hence know the value of the offset prior to decoding, the transmitters must encode their messages without knowing the offset between their codewords. Assuming that the offset can be guaranteed to be negligible with respect to the codeword length (e.g., if an upper bound on the offset is known by the transmitters), Cover et al. [5] and Narayan and Snyder [15] proved that the capacity region and the cutoff rate region, respectively, of
the discrete memoryless multiple-access channel are the same as in the frame-synchronous case. Poltyrev [18] and, independently, Hui and Humblet [12] have shown that if no information on the actual value of the offset is available to the transmitters (i.e., if the channel is “completely” frame-asynchronous), then the capacity region of the discrete memoryless multiple-access channel is as stated above but without the convex hull operation. This implies that in most memoryless channels of interest, frame asynchronism does not change the capacity region, the most well-known exceptions being, perhaps, the Massey-Mathys collision channel without feedback [13] and the counterexamples in [6, p. 287] and [3]. Furthermore, the Poltyrev-Hui-Humblet result implies that the maximum achievable rate sum $(R_1 + R_2)$ (or total capacity) of the memoryless multiple-access channel is never decreased by the lack of frame synchronism, because the rate sum of any convex combination of rate pairs is equal to the convex combination of the respective rate sums. As we shall see, these conclusions are no longer true when the multiple-access channel has memory.
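For a fixed pair of independent input distributions, the three bounds in (1) can be evaluated directly from the joint distribution of $(X, Y, Z)$. The sketch below does this for the noiseless binary adder channel $Z = X + Y$, which is chosen here only as an illustration and is not taken from the paper; the names `mac_rate_bounds`, `binary_adder`, and `uniform` are hypothetical, and logarithms are taken to base 2 although the paper leaves the base arbitrary.

```python
import itertools
import math
from collections import defaultdict


def mac_rate_bounds(px, py, channel):
    """Return (I(X;Z|Y), I(Y;Z|X), I(X,Y;Z)) in bits for independent inputs.

    px, py: dicts mapping input symbols to probabilities.
    channel: function (x, y) -> dict mapping output symbols to probabilities.
    """
    joint = defaultdict(float)  # joint pmf of (x, y, z)
    for (x, p_x), (y, p_y) in itertools.product(px.items(), py.items()):
        for z, p_z in channel(x, y).items():
            joint[(x, y, z)] += p_x * p_y * p_z

    def entropy(select):
        # Entropy of the marginal obtained by keeping select(outcome).
        marginal = defaultdict(float)
        for outcome, p in joint.items():
            marginal[select(outcome)] += p
        return -sum(p * math.log2(p) for p in marginal.values() if p > 0)

    h_xyz = entropy(lambda o: o)
    h_xy = entropy(lambda o: (o[0], o[1]))
    h_xz = entropy(lambda o: (o[0], o[2]))
    h_yz = entropy(lambda o: (o[1], o[2]))
    h_x = entropy(lambda o: o[0])
    h_y = entropy(lambda o: o[1])
    h_z = entropy(lambda o: o[2])

    i_x_z_given_y = h_xy + h_yz - h_y - h_xyz
    i_y_z_given_x = h_xy + h_xz - h_x - h_xyz
    i_xy_z = h_xy + h_z - h_xyz
    return i_x_z_given_y, i_y_z_given_x, i_xy_z


def binary_adder(x, y):
    # Noiseless adder channel: the output is the integer sum of the inputs.
    return {x + y: 1.0}


uniform = {0: 0.5, 1: 0.5}
print(mac_rate_bounds(uniform, uniform, binary_adder))
```

For uniform inputs on this channel the individual bounds equal 1 bit each and the sum-rate bound equals 1.5 bits, so the pentagon in (1) is strictly smaller than the unit square.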

Even though the study of the capacity of single-user channels with memory has occupied a prominent position in the development of the Shannon theory, multiple-access channels with memory have received scant attention in the literature (see van der Meulen [14, open problem 12]). Aside from their inherent conceptual and practical interest, multiple-access channels with memory play a key role in the modeling of symbol-asynchronous channels.¹ These are continuous-time channels where each codeword symbol modulates a finite-duration signal waveform and the transmitters do not cooperate to align the symbol epochs at the receiver. Since each symbol overlaps with two consecutive symbols transmitted by the other user, the equivalent discrete-time multiple-access channels required to model symbol-asynchronous channels have memory [21]. Therefore, the study of multiple-access channels where the transmitters are completely asynchronous leads to frame-asynchronous discrete-time multiple-access channels with memory.


¹One more type of channel “asynchronism” is that which allows deletions and insertions of symbols at locations unknown to the decoder. This has been studied by Dobrushin [7] and by Ahlswede and Gács [2] in the context of single-user and multiple-access memoryless channels, respectively.
The multiple-access channel with memory studied in this paper has finite input alphabets $A_1$ and $A_2$ and finite output alphabet $B$. Except for the general converse theorem for synchronous channels proved in Section II, our results are obtained under the assumption that the multiple-access channel is stationary and has finite (input) memory, in the sense that each channel output depends on up to m consecutive inputs of each user, and the outputs are conditionally independent given the inputs.² The multiple-access channel with finite memory encompasses many cases of practical interest such as the symbol-asynchronous channel and channels with finite-length intersymbol interference; its capacity has been solved in the single-user case in the works of Tsaregradsky [20], Feinstein [8], and Wolfowitz [22].

As usual when dealing with sources or channels with memory, the capacity region of the multiple-access channel with memory does not admit single-letter characterizations and, rather, is given in terms of a limit of regions. This fact does not curtail the applicability or interest of these results because those limits are computable, as we show in several examples where they result in explicit closed-form expressions. Moreover, we provide a theorem (which generalizes Wolfowitz's result [22, theorem 5.5.1] on the speed of convergence of single-user capacity) that allows the computation of the capacity region of the channel with memory up to any desired degree of approximation via the computation of achievable regions for memoryless channels.

As in the case of the memoryless multiple-access channel, the frame-synchronous channel with memory is shown to satisfy the time-sharing principle, i.e., its capacity region is convex. As a form of cooperation in the time domain, time-sharing requires nonstationary input distributions. Note that, while stationary inputs always achieve capacity in time-invariant single-user channels, there are time-invariant multiple-access channels (e.g., the aforementioned channels whose capacity region is decreased without frame synchronism) that require nonstationary inputs to achieve all points in the capacity region. In this paper we show that only stationary inputs are allowed for frame-asynchronous multiple-access channels with memory. Hence cooperation between the users is beneficial in the frequency domain (dependent inputs are necessary to achieve capacity because the channel has memory) but not in the time domain due to the lack of a common time reference. The opposite situation is encountered in the frame-synchronous memoryless multiple-access channel, where it is enough to restrict attention to independent input sequences and time-sharing (hence nonstationary) inputs may be required to achieve capacity. In the light of our results, the Poltyrev-Hui-Humblet result for memoryless channels is a consequence of the nonstationarity of time-sharing strategies.

²The results and proof techniques of this paper easily generalize to the case when the outputs are conditionally m-dependent given the inputs, i.e., when all pairs of subsets of random variables whose indices differ by more than m are independent.


The actual impact that the lack of frame synchronism (i.e., the restriction to stationary inputs) has on the capacity region of the multiple-access channel with memory is quite diverse. On one hand, there are many frame-synchronous channels (e.g., the symbol-asynchronous multiple-access channel considered in [21]) whose capacity regions are achieved by stationary inputs, and therefore, they do not decrease if the users are not guaranteed to transmit their codewords in unison. On the other hand, we show in this paper the existence of channels with memory where not only the capacity region but the total capacity is drastically reduced by the lack of frame synchronism.


II.  Frame-Synchronous  Capacity  Region 

We  give  first  a  general  converse  coding  theorem  for  the 
discrete  frame-synchronous  multiple-access  channel  that 
puts  no  restrictions  on  its  transition  probabilities. 

Theorem 1: The capacity region of the discrete frame-synchronous multiple-access channel satisfies³

$$C \subset \text{closure}\left( \liminf_{n \to \infty} \frac{1}{n} C_n \right) \tag{2}$$

where

$$C_n = \bigcup_{X^n, Y^n} \left\{ (R_1, R_2) : \begin{aligned} 0 \le R_1 &\le I(X^n; Z^n \mid Y^n) \\ 0 \le R_2 &\le I(Y^n; Z^n \mid X^n) \\ R_1 + R_2 &\le I(X^n, Y^n; Z^n) \end{aligned} \right\} \tag{3}$$

and the union is over independent n-dimensional input distributions. Note that the convex closure of $C_n$ is the capacity region of the discrete memoryless multiple-access channel whose input and output alphabets are $A_1^n$, $A_2^n$, and $B^n$, respectively, and whose transition probabilities are $P_{Z^n \mid X^n, Y^n}$.

Proof: We need to show that, for all $0 < \epsilon < 1$, every $\epsilon$-achievable rate pair $(R_1, R_2)$ belongs to the right side of (2). If $(R_1, R_2)$ is $\epsilon$-achievable, then for all $\gamma > 0$ and for all sufficiently large n there exists an $(n, M_1, M_2, \epsilon)$ code (i.e., a code with block length n, $M_i$ codewords for user i, and average probability that both messages are correctly decoded greater than or equal to $1 - \epsilon$) such that

$$\frac{\log M_i}{n} \ge R_i - \gamma, \qquad i = 1, 2. \tag{4}$$

Fix one such code and let $S_1$ and $S_2$ denote independent random variables uniformly distributed on $\{1, \ldots, M_1\}$ and $\{1, \ldots, M_2\}$. The message transmitted by user i is a realization of $S_i$. Let $Z^n$ denote the output of the channel, when $S_1$ and $S_2$ are transmitted using the above $(n, M_1, M_2, \epsilon)$ code, and let $(\hat{S}_1, \hat{S}_2)$ be the messages selected by the decoder. The Fano inequality states that

$$H(S_1, S_2 \mid Z^n) \le \log 2 + P[(S_1, S_2) \ne (\hat{S}_1, \hat{S}_2)] \log M_1 M_2 \tag{5a}$$

$$H(S_1 \mid Z^n) \le \log 2 + P[S_1 \ne \hat{S}_1] \log M_1 \tag{5b}$$

$$H(S_2 \mid Z^n) \le \log 2 + P[S_2 \ne \hat{S}_2] \log M_2. \tag{5c}$$

³All the logarithms, exponentials, entropies and mutual informations in this paper have a common arbitrary base.


Since the average probability of error of the code does not exceed $\epsilon$, the probabilities in the upper bounds of (5) can be replaced by $\epsilon$, and because $S_1$ and $S_2$ are uniformly distributed, we can write

$$I(S_1; Z^n) \ge (1 - \epsilon) \log M_1 - \log 2 \tag{6a}$$

$$I(S_2; Z^n) \ge (1 - \epsilon) \log M_2 - \log 2 \tag{6b}$$

$$I(S_1, S_2; Z^n) \ge (1 - \epsilon) \log M_1 M_2 - \log 2. \tag{6c}$$

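If $f_i : \{1, \ldots, M_i\} \to A_i^n$ denotes the codebook of user i, then since $S_1$ and $S_2$ are independent,

$$\begin{aligned}
I(f_1(S_1); Z^n \mid f_2(S_2)) &= I(f_1(S_1); f_2(S_2)) + I(f_1(S_1); Z^n \mid f_2(S_2)) \\
&= I(f_1(S_1); Z^n, f_2(S_2)) \\
&\ge I(f_1(S_1); Z^n) \\
&\ge I(S_1; Z^n)
\end{aligned} \tag{7}$$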


where the last inequality follows from the data-processing lemma. In a similar way, we obtain

$$I(f_2(S_2); Z^n \mid f_1(S_1)) \ge I(S_2; Z^n) \tag{8a}$$

$$I(f_1(S_1), f_2(S_2); Z^n) \ge I(S_1, S_2; Z^n). \tag{8b}$$

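Now, putting (4) together with (6)-(8), we get

$$(1 - \epsilon)(R_1 - \gamma) - \frac{\log 2}{n} \le \frac{1}{n} I(f_1(S_1); Z^n \mid f_2(S_2))$$

$$(1 - \epsilon)(R_2 - \gamma) - \frac{\log 2}{n} \le \frac{1}{n} I(f_2(S_2); Z^n \mid f_1(S_1))$$

$$(1 - \epsilon)(R_1 + R_2 - 2\gamma) - \frac{\log 2}{n} \le \frac{1}{n} I(f_1(S_1), f_2(S_2); Z^n)$$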


which implies that

$$(1 - \epsilon)(R_1 - \gamma, R_2 - \gamma) - \left( \frac{\log 2}{n}, \frac{\log 2}{n} \right) \in \frac{1}{n} C_n$$

for all sufficiently large n, and consequently,

$$(1 - \epsilon)(R_1 - 2\gamma, R_2 - 2\gamma) \in \frac{1}{n} C_n$$

for all sufficiently large n, or in other words

the  same  generality  as  Theorem  1.  We  will  henceforth 
focus  our  attention  on  the  following  class  of  multiple-access 
channels  with  memory. 

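Definition: A multiple-access channel with finite memory m is one whose channel transition probability satisfies

$$\begin{aligned}
P_{Z^n \mid X^n, Y^n}&(w_1, \ldots, w_n \mid a_1, \ldots, a_n, b_1, \ldots, b_n) \\
&= P_{Z_1, \ldots, Z_{m-1} \mid X_1, \ldots, X_{m-1}, Y_1, \ldots, Y_{m-1}}(w_1, \ldots, w_{m-1} \mid a_1, \ldots, a_{m-1}, b_1, \ldots, b_{m-1}) \\
&\quad \cdot \prod_{i=m}^{n} p_c(w_i \mid a_{i-m+1}, \ldots, a_i, b_{i-m+1}, \ldots, b_i)
\end{aligned} \tag{10}$$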

for  all  n  >  0. 

This implies that the outputs $Z_m, \ldots, Z_n$ are conditionally independent given the inputs, and each of them depends on m consecutive inputs of each user, thus encompassing intersymbol interference of finite duration. This definition allows us to handle the boundary outputs $Z_1, \ldots, Z_{m-1}$ (which depend on fewer than m input symbols from each user) in any arbitrary way, and it is therefore preferable to the single-user definition of [8] and [22] where the boundary outputs are not available to the decoder. As we shall see, the capacity region of the multiple-access channel with memory depends only on the transition probability $p_c$ and not on the conditional distribution of the first $m - 1$ outputs.

In  what  follows  it  is  convenient  to  refer  to  a  memoryless 
multiple-access  channel  derived  from  the  channel  with 
memory  in  the  following  way. 

Definition: Let $l \ge m$. The l-block multiple-access channel derived from a multiple-access channel with finite memory m is a memoryless channel characterized by input alphabets $A_1^l$ and $A_2^l$, output alphabet $B^{l-m+1}$, and transition probability

$$P((w_m, \ldots, w_l) \mid (a_1, \ldots, a_l), (b_1, \ldots, b_l)) = \prod_{i=m}^{l} p_c(w_i \mid a_{i-m+1}, \ldots, a_i, b_{i-m+1}, \ldots, b_i). \tag{11}$$

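It follows from (1) that the capacity region of the l-block memoryless multiple-access channel is equal to the convex closure of

$$Q_l = \bigcup_{X^l, Y^l} \left\{ (R_1, R_2) : \begin{aligned} 0 \le R_1 &\le I(X^l; Z_m^l \mid Y^l) \\ 0 \le R_2 &\le I(Y^l; Z_m^l \mid X^l) \\ R_1 + R_2 &\le I(X^l, Y^l; Z_m^l) \end{aligned} \right\} \tag{12}$$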

$$(1 - \epsilon)(R_1 - 2\gamma, R_2 - 2\gamma) \in \liminf_{n \to \infty} \frac{1}{n} C_n. \tag{9}$$

However, since $\epsilon$ and $\gamma$ are arbitrarily small, (9) implies that $(R_1, R_2)$ must be the limit of a sequence of points belonging to $\liminf_{n \to \infty} (1/n) C_n$ and therefore it belongs to the right side of (2) (as was to be shown).


where the union is over independent distributions on the sets $A_1^l$ and $A_2^l$, respectively, and $Z_m^l = (Z_m, \ldots, Z_l)$. The direct coding theorem for the multiple-access channel with finite memory gives the following achievable region as a function of the achievable region in (12) for the l-block multiple-access channel.
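For a noiseless channel the sum-rate bound in (12) reduces to the entropy of the retained outputs, $I(X^l, Y^l; Z_m^l) = H(Z_m^l)$, so it can be computed by brute-force enumeration for small l. The sketch below does this for the noiseless two-user duobinary channel $z_i = x_i + x_{i-1} + y_i + y_{i-1}$ with binary inputs and $m = 2$ (the channel of Example 1), under the assumption of independent, uniformly distributed input blocks. This particular input choice is an illustration only, so the result is one point of the union in (12) rather than its maximum; the function names are hypothetical and logarithms are base 2.

```python
import itertools
import math
from collections import Counter


def duobinary_outputs(x, y):
    """Outputs z_2, ..., z_l of z_i = x_i + x_{i-1} + y_i + y_{i-1} (memory m = 2).

    x and y are tuples of length l; the boundary output z_1 is not produced,
    matching the l-block channel, which keeps only outputs m through l.
    """
    return tuple(x[i] + x[i - 1] + y[i] + y[i - 1] for i in range(1, len(x)))


def sum_rate_uniform_inputs(l):
    """H(Z_m^l) in bits when both users send i.i.d. uniform binary blocks."""
    counts = Counter()
    for x in itertools.product((0, 1), repeat=l):
        for y in itertools.product((0, 1), repeat=l):
            counts[duobinary_outputs(x, y)] += 1
    total = 4 ** l  # number of equally likely input pairs
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


for block_length in range(2, 6):
    bits = sum_rate_uniform_inputs(block_length)
    print(block_length, bits, bits / block_length)
```

The last printed column is the sum-rate bound normalized by the block length, which is the quantity that enters the regions $(1/l) Q_l$.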


As in the case of the single-user channel with memory,
no  universal  direct  coding  theorem  is  known  to  hold  with 


Theorem 2: The capacity region of the frame-synchronous multiple-access channel with finite memory m satisfies

$$C \supset \text{closure}\left( \bigcup_{l \ge m} \frac{1}{l} Q_l \right). \tag{13}$$

Proof: We need to show that, for all $l \ge m$ and $(R_1, R_2) \in (1/l) Q_l$ and for every $0 < \epsilon < 1$ and $\gamma > 0$, there exist $(n, M_1, M_2, \epsilon)$ codes for all sufficiently large n such that
lower rates than the original code with block length kl, the decrease is inappreciable for large k.

The following result proves that the inner and outer bounds shown in Theorems 1 and 2 coincide.

Theorem 3: The capacity region of the frame-synchronous multiple-access channel with finite memory m is given by


then $\bigcup_{l \ge m} (1/l) Q_l$ will be an achievable region and so will its closure since the capacity region is a closed set.

First, we will fix l and show the existence of said codes for sufficiently large multiples of l: $n = kl$. Since $Q_l$ is an achievable region of the l-block memoryless multiple-access channel, if $(R_1', R_2') \in Q_l$, then for every $\gamma_1 > 0$ and all k sufficiently large, there exist $(k, M_1, M_2, \epsilon)$ codes for the l-block channel such that


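$$C = \text{closure}\left( \liminf_{n \to \infty} \frac{1}{n} Q_n \right) = \text{closure}\left( \limsup_{n \to \infty} \frac{1}{n} C_n \right) = \text{closure}\left( \liminf_{n \to \infty} \frac{1}{n} C_n \right). \tag{17}$$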

Proof: The essence of the proof is the following inequality which holds for all $l \ge m$:

$$\begin{aligned}
I(X^l, Y^l; Z^l) &= I(X^l, Y^l; Z_m^l) + I(X^l, Y^l; Z^{m-1} \mid Z_m^l) \\
&\le I(X^l, Y^l; Z_m^l) + (m - 1) \log |B|
\end{aligned} \tag{18a}$$

$$\frac{\log M_i}{k} \ge R_i' - \gamma_1, \qquad i = 1, 2. \tag{14}$$

Now, we fix one such code and view the symbols in each of its codewords sequentially. In this way, we have a code for the multiple-access channel with memory with block length kl and $M_i$ codewords for user i. Its probability of error is not greater than $\epsilon$ because if we were to constrain the decoder not to use the outputs

$$Z_1, \ldots, Z_{m-1}, \; Z_{l+1}, \ldots, Z_{l+m-1}, \; \ldots, \; Z_{(k-1)l+1}, \ldots, Z_{(k-1)l+m-1},$$

then the situation would be entirely equivalent to decoding in the l-block memoryless channel where there is no interference between the l-blocks. Clearly, if those outputs are not discarded, the probability of error cannot increase. Now letting


$(R_1', R_2') = l (R_1, R_2) \in Q_l$ (and taking $\gamma_1 = l\gamma/2$), (14) results in

$$\frac{\log M_i}{kl} \ge R_i - \frac{\gamma}{2} \tag{15}$$


as we wanted to show. However, this only proves the existence of reliable codes with the desired rates for block lengths that are multiples of l. To find codes whose block length is $n = kl + t$, $t = 1, \ldots, l - 1$, we append t arbitrary input symbols to each of the codewords of the foregoing $(kl, M_1, M_2, \epsilon)$ codes, and let the decoder discard the last t outputs. Then, it is clear that the probability of error remains unchanged and the rates of the new codes satisfy (via (15))

$$\frac{\log M_i}{kl + t} \ge R_i - \frac{t}{kl + t} R_i - \frac{kl}{kl + t} \frac{\gamma}{2}, \qquad i = 1, 2. \tag{16}$$


For sufficiently large k, however, $t R_i / (lk + t) < \gamma/2$, in which case the right side of (16) can be further lower-bounded by $R_i - \gamma$. Thus even though the new code has


and similarly,

$$I(X^l; Z^l \mid Y^l) \le I(X^l; Z_m^l \mid Y^l) + (m - 1) \log |B| \tag{18b}$$

$$I(Y^l; Z^l \mid X^l) \le I(Y^l; Z_m^l \mid X^l) + (m - 1) \log |B| \tag{18c}$$

which imply that

$$C_l \subset Q_l + (m - 1) \log |B| \, U \tag{19}$$

where U is the unit square $\{(x_1, x_2) : 0 \le x_1 \le 1, \; 0 \le x_2 \le 1\}$.
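We now have the following chain of inclusions:

$$\begin{aligned}
\text{closure}\left( \limsup_{n \to \infty} \frac{1}{n} C_n \right) &\subset \text{closure}\left( \limsup_{n \to \infty} \frac{1}{n} Q_n \right) \\
&\subset \text{closure}\left( \bigcup_{n \ge m} \frac{1}{n} Q_n \right) \\
&\subset C \\
&\subset \text{closure}\left( \liminf_{n \to \infty} \frac{1}{n} C_n \right) \\
&\subset \text{closure}\left( \liminf_{n \to \infty} \frac{1}{n} Q_n \right)
\end{aligned} \tag{20}$$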

where  the  first  and  last  inclusions  follow  easily  from  (19) 
and  the  third  and  fourth  inclusions  are  Theorems  2  and  1, 
respectively. Finally, since the liminf is a subset of the limsup, all the inclusions in (20) are in fact equalities.

The closure operation in (17) is indeed necessary because even if $\lim_{n \to \infty} (Q_n / n)$ exists, it may not be a closed set (e.g., if the first $m - 1$ boundary outputs are independent of the inputs). At first sight it may seem surprising that the capacity region of Theorem 3 does not involve an explicit convex hull operation, especially in light of the fact that the particular case of the frame-synchronous memoryless multiple-access channel is known to require the convex hull operation. In fact the capacity region of Theorem 3 is already convex because it is given as a limit of achievable regions for n-block channels whose input distributions are allowed to time-share among several distributions as a result of the assumption that the users are frame-synchronous. This is formalized in the following result.

Corollary  (Time-Sharing  Principle):  The  capacity  region 
of  the  frame-synchronous  multiple-access  channel  with 
finite  memory  is  a  convex  set. 

Proof: It follows from Theorem 3 that the capacity region is independent of the conditional probability of the $(m - 1)$ boundary outputs; therefore, we can prove the corollary for any arbitrary choice of this probability; in particular, we shall assume that $Z_1, \ldots, Z_{m-1}$ are independent of the inputs. Then we can follow the same approach as in the proof of the time-sharing principle for memoryless channels [6] which juxtaposes two codes. If we impose the restriction that the decoder must discard the leading $(m - 1)$ output symbols of each of the two blocks, then the decoding of the new code is decoupled and equivalent to the case when the codewords are sent individually. Therefore, the error probability of the new code does not exceed the sum of the probabilities of error of the two component codes, the rate pair is a convex combination of the rate pairs of both codes, and the proof proceeds as in the memoryless case [6, p. 272].

Theorem 3 can easily be generalized in several directions. For example, the proofs of both the converse and the direct theorems remain essentially unchanged for continuous-alphabet channels with input constraints. Another generalization which is of interest in the symbol-asynchronous channel [21] is that of a compound multiple-access channel where the transmitters only know that the channel belongs to an uncertainty set $\Gamma$ (cf. [6, p. 288] for the corresponding memoryless result). In that problem, the proof of the direct theorem requires very little modification since the construction of codebooks therein is independent of the channel, and the proof of the converse only needs to take care of the fact that a good code must be so for any possible channel in the uncertainty set. Then Theorem 3 holds by replacing $C_n$ by

$$\bigcup_{X^n, Y^n} \bigcap_{u \in \Gamma} \left\{ (R_1, R_2) : \begin{aligned} 0 \le R_1 &\le I(X^n; Z^n(u) \mid Y^n) \\ 0 \le R_2 &\le I(Y^n; Z^n(u) \mid X^n) \\ R_1 + R_2 &\le I(X^n, Y^n; Z^n(u)) \end{aligned} \right\}$$

where $Z^n(u)$ is connected to $X^n$ and $Y^n$ through channel $u \in \Gamma$.

For the purposes of illustration we will show several examples where the limits of Theorem 3 are explicitly computable. However, in cases without much structure an alternative to the analytical computation of those limits is their numerical approximation. This can be done using the following theorem, which allows the computation of the capacity region as accurately as desired via the computation of achievable regions for memoryless channels. Theorem 4 is a generalization of the single-user result obtained by Wolfowitz [22, theorem 5.5.1].⁴

⁴Added in proof: Theorem 4 gives a solution to Problem 1 in [23].


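Theorem 4: The capacity region of the frame-synchronous multiple-access channel with finite memory satisfies for every $l \ge m$

$$\text{closure}\left( \text{convex} \, \frac{1}{l} Q_l \right) \subset C \subset \text{closure}\left( \text{convex} \, \frac{1}{l} Q_l \right) + \frac{m - 1}{l} \log |B| \, U \tag{21}$$

where U is the unit square $\{(x_1, x_2) : 0 \le x_1 \le 1, \; 0 \le x_2 \le 1\}$.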
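The inner and outer bounds in (21) differ by at most $\frac{m-1}{l} \log |B|$ in each coordinate, so the block length needed for a prescribed accuracy can be read off directly. The following sketch computes that gap and the smallest l meeting a tolerance; the function names and the base-2 default are assumptions made here for illustration, since the paper leaves the logarithm base arbitrary.

```python
import math


def approximation_gap(m, l, output_alphabet_size, base=2.0):
    """Per-coordinate gap ((m - 1) / l) * log|B| between the bounds in (21)."""
    if l < m:
        raise ValueError("Theorem 4 requires l >= m")
    return (m - 1) / l * math.log(output_alphabet_size, base)


def block_length_for_gap(m, output_alphabet_size, tolerance, base=2.0):
    """Smallest l >= m for which the gap in (21) is at most `tolerance`."""
    needed = (m - 1) * math.log(output_alphabet_size, base) / tolerance
    return max(m, math.ceil(needed))


# Memory m = 2 and five output symbols, as in the duobinary channel of Example 1.
l_needed = block_length_for_gap(2, 5, 0.1)
print(l_needed, approximation_gap(2, l_needed, 5))
```

The gap decays only as $1/l$, while the input alphabets of the l-block channel grow exponentially in l, so a tight numerical approximation can be computationally demanding.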

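Proof: The inner bound is a consequence of Theorem 2 and the corollary to Theorem 3. To show the upper bound, fix $l \ge m$ and notice that for any $n = kl$, $X^n$, and $Y^n$ (writing $Z_a^b = (Z_a, \ldots, Z_b)$),

$$\begin{aligned}
I(X^n, Y^n; Z^n) &= I\big(X^n, Y^n; Z_m^l, Z_{l+m}^{2l}, \ldots, Z_{(k-1)l+m}^{kl}\big) \\
&\quad + I\big(X^n, Y^n; Z_1^{m-1}, Z_{l+1}^{l+m-1}, \ldots, Z_{(k-1)l+1}^{(k-1)l+m-1} \,\big|\, Z_m^l, \ldots, Z_{(k-1)l+m}^{kl}\big) \\
&\le k(m - 1) \log |B| + I\big(X^n, Y^n; Z_m^l, \ldots, Z_{(k-1)l+m}^{kl}\big) \\
&= k(m - 1) \log |B| + H\big(Z_m^l, \ldots, Z_{(k-1)l+m}^{kl}\big) - H\big(Z_m^l, \ldots, Z_{(k-1)l+m}^{kl} \,\big|\, X^n, Y^n\big) \\
&\le k(m - 1) \log |B| + \sum_{j=0}^{k-1} H\big(Z_{jl+m}^{jl+l}\big) - H\big(Z_m^l, \ldots, Z_{(k-1)l+m}^{kl} \,\big|\, X^n, Y^n\big) \\
&= k(m - 1) \log |B| + \sum_{j=0}^{k-1} \Big\{ H\big(Z_{jl+m}^{jl+l}\big) - H\big(Z_{jl+m}^{jl+l} \,\big|\, X_{jl+1}^{jl+l}, Y_{jl+1}^{jl+l}\big) \Big\} \\
&= k(m - 1) \log |B| + \sum_{j=0}^{k-1} I\big(X_{jl+1}^{jl+l}, Y_{jl+1}^{jl+l}; Z_{jl+m}^{jl+l}\big)
\end{aligned} \tag{22}$$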

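where the next-to-last identity follows from the definition of the channel with finite memory. Similarly, we can upper-bound

$$I(X^n; Z^n \mid Y^n) \le k(m - 1) \log |B| + \sum_{j=0}^{k-1} I\big(X_{jl+1}^{jl+l}; Z_{jl+m}^{jl+l} \,\big|\, Y_{jl+1}^{jl+l}\big) \tag{23a}$$

$$I(Y^n; Z^n \mid X^n) \le k(m - 1) \log |B| + \sum_{j=0}^{k-1} I\big(Y_{jl+1}^{jl+l}; Z_{jl+m}^{jl+l} \,\big|\, X_{jl+1}^{jl+l}\big). \tag{23b}$$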

However, we saw in the proof of Theorem 1 that if $(R_1, R_2)$ is $\epsilon$-achievable, then for all $\gamma > 0$ and for all sufficiently large $n$, there exist $(n, M_1, M_2, \epsilon)$ codes such that

$$\frac{1}{n}\log M_i \ge R_i - \gamma, \quad i = 1, 2 \tag{24}$$

and for some input distributions $X^n$ and $Y^n$,

$$(1-\epsilon)\log M_1 - \log 2 \le I(X^n; Z^n \mid Y^n) \tag{25a}$$

$$(1-\epsilon)\log M_2 - \log 2 \le I(Y^n; Z^n \mid X^n) \tag{25b}$$

$$(1-\epsilon)\log M_1 M_2 - \log 2 \le I(X^n, Y^n; Z^n). \tag{25c}$$

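Now, combining (22)-(25), we obtain for all sufficiently large $k$

$$\frac{1}{k}[(1-\epsilon)\log M_1 - \log 2] - (m-1)\log|B| \le \frac{1}{k}\sum_{j=0}^{k-1} I(X_{jl+1}^{(j+1)l}; Z_{jl+m}^{(j+1)l} \mid Y_{jl+1}^{(j+1)l})$$

$$\frac{1}{k}[(1-\epsilon)\log M_2 - \log 2] - (m-1)\log|B| \le \frac{1}{k}\sum_{j=0}^{k-1} I(Y_{jl+1}^{(j+1)l}; Z_{jl+m}^{(j+1)l} \mid X_{jl+1}^{(j+1)l})$$

$$\frac{1}{k}[(1-\epsilon)\log M_1 M_2 - \log 2] - (m-1)\log|B| \le \frac{1}{k}\sum_{j=0}^{k-1} I(X_{jl+1}^{(j+1)l}, Y_{jl+1}^{(j+1)l}; Z_{jl+m}^{(j+1)l})$$

which implies that

$$\frac{1}{k}(1-\epsilon)(\log M_1, \log M_2) \in \mathrm{convex}\{Q_l\} + \left(\frac{\log 2}{k} + (m-1)\log|B|\right) U$$

for all sufficiently large $k$. This together with (24) implies that any $\epsilon$-achievable pair $(R_1, R_2)$ satisfies

$$(1-\epsilon)(R_1, R_2) \in \mathrm{closure}\left(\mathrm{convex}\,\frac{1}{l} Q_l\right) + \frac{m-1}{l}\log|B|\, U. \tag{26}$$

Thus if $(R_1, R_2)$ is an achievable pair, then it must belong to the closed set in the right side of (26).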

The following examples serve to illustrate the analytical evaluation of the capacity region of the frame-synchronous multiple-access channel with memory. In Section III, we derive the capacity of these channels in the absence of synchronism.

Example 1: Consider the following multiple-access channel with finite ($m = 2$) memory which is a simple discrete-time noiseless model of two-user duobinary transmission: $A_1 = A_2 = \{0, 1\}$, $B = \{0, 1, 2, 3, 4\}$,

$$z_i = x_i + x_{i-1} + y_i + y_{i-1} \tag{27}$$

where, according to Theorem 3, it is not necessary to specify the initial conditions as far as computing the capacity region is concerned. To evaluate $C$, first we compute the mutual informations in the definition of $Q_n$ (12). Since the outputs are deterministic given the inputs,

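we have

$$I(X_1, \ldots, X_n, Y_1, \ldots, Y_n; Z_2, \ldots, Z_n) = H(Z_2, \ldots, Z_n) \tag{28a}$$

$$I(X_1, \ldots, X_n; Z_2, \ldots, Z_n \mid Y_1, \ldots, Y_n) = H(Z_2, \ldots, Z_n \mid Y_1, \ldots, Y_n) \tag{28b}$$

$$I(Y_1, \ldots, Y_n; Z_2, \ldots, Z_n \mid X_1, \ldots, X_n) = H(Z_2, \ldots, Z_n \mid X_1, \ldots, X_n). \tag{28c}$$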

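Moreover, the properties of conditional entropy result in

$$H(Z_2, \ldots, Z_n \mid X_1 + Y_1) \le H(Z_2, \ldots, Z_n) \le H(Z_2, \ldots, Z_n \mid X_1 + Y_1) + H(X_1 + Y_1) \tag{29}$$

and

$$
\begin{aligned}
H(Z_2, \ldots, Z_n \mid X_1 + Y_1)
&= H(Z_2 \mid X_1 + Y_1) + H(Z_3 \mid X_1 + Y_1, Z_2) + \cdots + H(Z_n \mid X_1 + Y_1, Z_2, \ldots, Z_{n-1}) \\
&= H(X_2 + Y_2 \mid X_1 + Y_1) + H(X_3 + Y_3 \mid X_1 + Y_1, X_2 + Y_2) \\
&\quad + \cdots + H(X_n + Y_n \mid X_1 + Y_1, \ldots, X_{n-1} + Y_{n-1}) \\
&= H(X_1 + Y_1, \ldots, X_n + Y_n) - H(X_1 + Y_1).
\end{aligned}
\tag{30}
$$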
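The sandwich in (29) and (30) can be checked by brute force for short blocks. The following sketch assumes i.i.d. equiprobable binary inputs (a choice made here purely for illustration), for which $H(X_1+Y_1, \ldots, X_n+Y_n) = 1.5n$ bits, so (29) and (30) predict $1.5(n-1) \le H(Z_2, \ldots, Z_n) \le 1.5n$. The function names are hypothetical.

```python
import itertools
import math
from collections import Counter


def entropy_bits(counts: Counter) -> float:
    """Entropy in bits of the empirical distribution given by a Counter."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def duobinary_outputs(x, y):
    """Outputs z_2..z_n of (27): z_i = x_i + x_{i-1} + y_i + y_{i-1}."""
    return tuple(x[i] + x[i - 1] + y[i] + y[i - 1] for i in range(1, len(x)))


def output_entropy(n: int) -> float:
    """H(Z_2, ..., Z_n) in bits when all 2n input bits are i.i.d. equiprobable."""
    counts = Counter()
    for x in itertools.product((0, 1), repeat=n):
        for y in itertools.product((0, 1), repeat=n):
            counts[duobinary_outputs(x, y)] += 1
    return entropy_bits(counts)


if __name__ == "__main__":
    for n in range(2, 6):
        h = output_entropy(n)
        print(n, h, h / n)  # h / n is the normalized sum-rate bound
```

The enumeration grows as $4^n$, so this is only a sanity check for small $n$, not a way to evaluate the limit.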

Also, using the definition of the channel and the fact that $(X_1, \ldots, X_n)$ and $(Y_1, \ldots, Y_n)$ are mutually independent, we can write

$$H(Z_2, \ldots, Z_n \mid Y_1, \ldots, Y_n) = H(X_1 + X_2, \ldots, X_{n-1} + X_n \mid Y_1, \ldots, Y_n) = H(X_1 + X_2, \ldots, X_{n-1} + X_n) \tag{31a}$$

and, similarly,

$$H(Z_2, \ldots, Z_n \mid X_1, \ldots, X_n) = H(Y_1 + Y_2, \ldots, Y_{n-1} + Y_n). \tag{31b}$$

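Now, putting together (28)-(31), we obtain

$$
\begin{aligned}
C &= \mathrm{closure}\left(\liminf_{n\to\infty} \frac{1}{n} Q_n\right) \\
&= \mathrm{closure}\Big(\liminf_{n\to\infty} \bigcup_{X^n, Y^n} \Big\{(R_1, R_2):
0 \le R_1 \le \tfrac{1}{n} H(X_1 + X_2, \ldots, X_{n-1} + X_n), \\
&\qquad 0 \le R_2 \le \tfrac{1}{n} H(Y_1 + Y_2, \ldots, Y_{n-1} + Y_n),\
R_1 + R_2 \le \tfrac{1}{n} H(Z_2, \ldots, Z_n)\Big\}\Big) \\
&= \mathrm{closure}\Big(\liminf_{n\to\infty} \bigcup_{X^n, Y^n} \Big\{(R_1, R_2):
0 \le R_1 \le \tfrac{1}{n} H(X_1, \ldots, X_n), \\
&\qquad 0 \le R_2 \le \tfrac{1}{n} H(Y_1, \ldots, Y_n),\
R_1 + R_2 \le \tfrac{1}{n} H(X_1 + Y_1, \ldots, X_n + Y_n)\Big\}\Big)
\end{aligned}
\tag{32}
$$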


and since each of the three entropies in (32) is maximized by independent equiprobable inputs ($\max H(X + Y)$ over independent binary $X$ and $Y$ is equal to 1.5 bit and is achieved by equiprobable distributions), the right side of (32) is equal to the pentagon $C = \{0 \le R_1 \le 1,\ 0 \le R_2 \le 1,\ R_1 + R_2 \le 1.5\}$.
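The 1.5-bit figure quoted above is easy to confirm numerically. The sketch below (names and grid resolution are illustrative choices, not part of the paper) evaluates $H(X+Y)$ for independent binary $X$ and $Y$ with $P[X=1]=p$, $P[Y=1]=q$ and searches a grid of $(p, q)$ values.

```python
import math


def entropy_bits(probs):
    """Entropy in bits of a probability vector (zero entries are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def sum_entropy(p: float, q: float) -> float:
    """H(X + Y) in bits for independent binary X, Y with P[X=1]=p, P[Y=1]=q."""
    return entropy_bits([(1 - p) * (1 - q), p * (1 - q) + (1 - p) * q, p * q])


def grid_max(steps: int = 200):
    """Largest H(X + Y) over a (steps+1) x (steps+1) grid, with its (p, q)."""
    grid = [i / steps for i in range(steps + 1)]
    return max((sum_entropy(p, q), p, q) for p in grid for q in grid)


if __name__ == "__main__":
    print(sum_entropy(0.5, 0.5))
    print(grid_max())
```

With equiprobable inputs the sum takes the values 0, 1, 2 with probabilities 1/4, 1/2, 1/4, which gives the sum-rate side of the pentagon.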

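Example 2: Let $A_1 = A_2 = \{0, 1, 2\}$ and $B = \{1, 2\}$

$$
z_i = \begin{cases}
x_i, & \text{if } x_i \ne 0 \text{ and } y_i = 0 \text{ and } y_{i-1} \ne 0 \\
y_i, & \text{if } y_i \ne 0 \text{ and } x_i = 0 \text{ and } x_{i-1} \ne 0 \\
(1/2, 1/2), & \text{otherwise}
\end{cases}
\tag{33}
$$

where $(1/2, 1/2)$ indicates that $z_i$ is equally likely to be 1 or 2.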

In this channel it is necessary for the encoders to use some sort of time-sharing to achieve optimum rates because simultaneous zeros or nonzeros and consecutive zeros result in equally likely outputs. We take the following initial conditional distribution (this choice does not affect the capacity region but simplifies the proof):


(1/2, 1/2), 


if*-0 

if  yx*  0. 


We will now investigate the maximum achievable rates when transmitter 1 (respectively, 2) sends nonzeros at odd-numbered (respectively, even-numbered) times (with no restrictions otherwise). Then, it follows from (33) and (34) that

$$
z_{2k+1} = \begin{cases}
x_{2k+1}, & \text{if } y_{2k+1} = 0 \\
(1/2, 1/2), & \text{if } y_{2k+1} \ne 0
\end{cases}
\tag{35a}
$$

$$
z_{2k} = \begin{cases}
y_{2k}, & \text{if } x_{2k} = 0 \\
(1/2, 1/2), & \text{if } x_{2k} \ne 0
\end{cases}
\tag{35b}
$$

and the capacity region of (35a) is obtained by interchanging $R_1$ and $R_2$ in (36). The sum of these two regions divided by 2 (since each channel is only used half the time) is found in Fig. 1. Another example where the capacity region of the multiple-access channel is explicitly computed is the symbol-asynchronous energy-constrained Gaussian channel ([21] is devoted to the evaluation of the limit characterizing the capacity region).


Fig. 1. Achievable regions with and without frame-synchronism of multiple-access channel in Example 2.


III.  Frame-Asynchronous  Capacity  Region 


which means that the channel is actually decoupled into two identical memoryless channels whose capacity region is obtained as follows.

If $(X_{2k}, Y_{2k}, Z_{2k})$ are connected by (35b), then their mutual informations are easily shown to be given (in bits) by

<  V.  MH^a  -1M  Xlk  *  o] 

( *2A.  V,  Z2k)  -  A*(l/2  +  P[X2k  *  03(1/2 

^A-lDWtA’u-O] 

where $h(x) = -x \log x - (1-x)\log(1-x)$. All these mutual informations are maximized simultaneously by $P[Y_{2k} = 1] = P[Y_{2k} = 2] = 1/2$, and so the capacity region of the channel in (35b) is


0SAS1 

V  V  4  / 

i  £  R,  £l-  p 

0<;/?2z;l -**(£)} 

+  R2  ^  1  ■”  p } 

(36) 

Unlike frame-synchronous channels where it is enough to consider “one-shot” models in which each user transmits only one codeword, the (completely) frame-asynchronous multiple-access channel cannot be decoupled into independent blocks due to the overlap between consecutive codewords, and the optimum decoder needs to decode all messages simultaneously, i.e., all outputs are useful in making decisions about any particular codeword. Ideally, the goal would be to analyze a model with doubly infinite streams of codewords subject to an arbitrary shift. However, to formulate a well-posed problem, it is necessary (at least within the realm of channel block coding) to work with a finite number $N$ of transmitted codewords per user and then analyze the limiting behavior of the capacity as $N \to \infty$. Since the offset between both strings is arbitrary, the approach we take is to arrange the $N$ codewords of each user in a ring (codeword $N$ is followed by codeword 1) and to model the offset by an arbitrary relative rotation of both rings (Fig. 2). As $N \to \infty$, the radius of the ring becomes infinite, and the ring models the desired infinite codeword streams offset by an arbitrary shift because, for each output symbol, the boundary condition at infinity is irrelevant. As we will see and should expect (because of the finite memory the codeword boundaries become irrelevant as the codeword length goes to infinity), the capacity region (per channel use) of the $N$-ring does not depend on $N$, and therefore, it is not necessary to investigate its limiting behavior. The main disadvantage of an alternative linear arrangement of the $N$ codewords is that, due to the lack of synchronism, not all the codewords overlap with the other user’s stream, and those that do overlap have different decoding error probabilities depending on the offset and their relative location to the boundaries. Note that this problem can be partially avoided by restricting the shift not to exceed the length of a codeword, but since the total number of codewords is assumed finite, it can be argued that such an approach would assume a certain degree of cooperation between the transmitters.

Fig. 2. Four-ring with codeword length equal to 12 and memory length $m = 2$.

Each transmitter encodes its $N$ messages independently (each message is drawn independently from $\{1, \ldots, M_i\}$) and is not restricted to use the same code book for each message. While the receiver acquires the location of each codeword prior to decoding (this can be easily accomplished using synchronization prefixes), the messages are encoded without knowledge of the relative rotation. Therefore, the channel is a decoder-informed compound channel, which is equivalent, from the viewpoint of finding the capacity region, to a bank of parallel multiple-access channels (one per rotation value) sharing the same inputs.

Theorem 5: If $(R_1, R_2)$ is an achievable rate pair for the $N$-ring, then

$$(R_1, R_2) \in \mathrm{closure}\left(\liminf_{n\to\infty} \frac{1}{nN} Q^s_{nN}\right) \subset \mathrm{closure}\left(\limsup_{n\to\infty} \frac{1}{n} Q^s_n\right) = \mathrm{closure}\left(\limsup_{n\to\infty} \frac{1}{n} C^s_n\right) \tag{37}$$

where $Q^s_n$ and $C^s_n$ are defined as in (12) and (3), except that the union therein is taken only over $n$-dimensional distributions induced by stationary probability measures.

Proof: There are $N$ $(n, M_i)$ code books for the $i$th user and each of the $N$ messages is encoded independently. Thus the $(nN, M_i^N)$ juxtaposition code book of the $i$th user for the $N$-ring consists of the Cartesian product of the $N$ $(n, M_i)$ code books. If $(R_1, R_2)$ is achievable, then for all $\epsilon > 0$, $\delta > 0$ and all $n$ sufficiently large, there exists a $(nN, M_1^N, M_2^N, \epsilon)$ juxtaposition code for the $N$-ring such that

$$\frac{1}{n}\log M_i \ge R_i - \delta, \quad i = 1, 2,$$

i.e., there exist $N$ $(n, M_1)$ code books for user 1 and $N$ $(n, M_2)$ code books for user 2 (which are independent of the offset) and a decoding strategy (which depends on the offset) such that the average (over the set of equiprobable messages) probability of error does not exceed $\epsilon$ regardless of the offset. Select one such code and denote the independent messages of both users by $(S_1, \ldots, S_N)$ and $(T_1, \ldots, T_N)$, respectively. Then, the Fano inequality implies that

$$H(S_1, \ldots, S_N \mid Z) \le \epsilon \log M_1^N + \log 2 \tag{38a}$$

$$H(T_1, \ldots, T_N \mid Z) \le \epsilon \log M_2^N + \log 2 \tag{38b}$$

$$H(S_1, \ldots, S_N, T_1, \ldots, T_N \mid Z) \le \epsilon \log M_1^N M_2^N + \log 2 \tag{38c}$$

where $Z$ is the distribution of the totality of the outputs of the $N$-ring. If $X$ and $Y$ are the distributions of the $A_1^{nN}$ and $A_2^{nN}$ valued random variables resulting from the encoding of the messages by the selected code books, the lack of frame synchronism is modeled by assuming that the inputs to the $N$-ring are rotated versions of $X$ and $Y$. If $x$ is an $M$-vector, then $r_\tau(x)$ denotes an $M$-vector whose components coincide with those of $x$ rotated by $\tau$ positions, where $\tau \in \{0, \ldots, M-1\}$, i.e.,

$$r_\tau(a_1, \ldots, a_M) = (a_{M-\tau+1}, \ldots, a_M, a_1, \ldots, a_{M-\tau}).$$
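As a concrete reading of the rotation operator just defined, the following minimal sketch implements $r_\tau$ on tuples. The function name `rotate` is an illustrative choice, and reducing $\tau$ modulo $M$ is an added convenience rather than part of the definition.

```python
def rotate(a, tau):
    """r_tau(a_1, ..., a_M) = (a_{M-tau+1}, ..., a_M, a_1, ..., a_{M-tau})."""
    m = len(a)
    tau %= m  # convenience: rotations are taken modulo the vector length
    return tuple(a[m - tau:]) + tuple(a[:m - tau])


if __name__ == "__main__":
    print(rotate((1, 2, 3, 4, 5), 2))
```

Rotations compose additively, so a rotation by $\tau_1$ followed by a rotation by $\tau_2$ equals a rotation by $\tau_1 + \tau_2$ modulo $M$.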

Even though it is enough to consider a relative rotation of both rings, it is more convenient in the proof of the converse to allow a rotation of both input rings with respect to an arbitrary reference. Denoting the rotations by $\tau_1$ and $\tau_2$, the data-processing lemma implies that

$$
\begin{aligned}
I(r_{\tau_1}(X); Z, r_{\tau_2}(Y)) &\ge I(S_1, \ldots, S_N; Z) \\
&= H(S_1, \ldots, S_N) - H(S_1, \ldots, S_N \mid Z) \\
&\ge (1-\epsilon)\log M_1^N - \log 2
\end{aligned}
\tag{39}
$$

where the last inequality follows from (38a) and with a slight abuse of notation we have denoted by $r_\tau(X)$ the probability measure that assigns the same mass to $r_\tau(a_1, \ldots, a_M)$ as $X$ assigns to $(a_1, \ldots, a_M)$. From the independence of $X$ and $Y$ and (39), we have that

$$I(r_{\tau_1}(X); Z \mid r_{\tau_2}(Y)) \ge (1-\epsilon)\log M_1^N - \log 2.$$

However, since this is true regardless of the actual value of the offsets $\tau_1$ and $\tau_2$ we can write

$$(1-\epsilon)\log M_1^N - \log 2 \le \frac{1}{nN}\sum_{\tau_2 = 0}^{nN-1} I(r_{\tau_1}(X); Z \mid r_{\tau_2}(Y)) = I(r_{\tau_1}(X); Z \mid c(Y)) \tag{40}$$

where the probability measure $c(Y)$ is equal to the following mixture of the probability measures $r_\tau(Y)$, $\tau = 0, \ldots, nN-1$:

$$c(Y) = \frac{1}{nN}\sum_{\tau=0}^{nN-1} r_\tau(Y), \tag{41}$$

and the second equation in (40) follows from the fact that the distribution of the conditioning random variable enters linearly in the definition of conditional mutual information.

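An $M$-dimensional probability measure $p$ will be referred to as circulant if $r_\tau(p) = p$ for all $\tau$, and we will say that $p$ is stationary if, for any subset $\{i_1, \ldots, i_k\} \subset \{1, \ldots, M\}$ and shift $s > 0$ such that $\{i_1 + s, \ldots, i_k + s\} \subset \{1, \ldots, M\}$, the marginal of $p$ on the coordinates $(i_1, \ldots, i_k)$ coincides with its marginal on the coordinates $(i_1 + s, \ldots, i_k + s)$.

For any probability measure $p$, $c(p)$ (defined in (41)) is a circulant probability measure. To see this note that, for any $\lambda \in \{0, \ldots, M\}$,^5

$$r_\lambda(c(p)) = \frac{1}{M}\sum_{\tau=0}^{M-1} r_{(\lambda+\tau)_M}(p) = \frac{1}{M}\sum_{\tau=0}^{M-1} r_\tau(p) = c(p).$$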
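The circulant property of the mixture $c(p)$ can be verified directly on a small example. This is a sketch under stated assumptions: a probability measure is represented as a dictionary from tuples to exact `Fraction` masses, and the helper names (`rotate_measure`, `circulant_mixture`, `is_circulant`) are illustrative, not from the paper.

```python
from collections import defaultdict
from fractions import Fraction


def rotate(a, tau):
    """r_tau(a_1, ..., a_M) = (a_{M-tau+1}, ..., a_M, a_1, ..., a_{M-tau})."""
    m = len(a)
    tau %= m
    return tuple(a[m - tau:]) + tuple(a[:m - tau])


def rotate_measure(p, tau):
    """r_tau(p): assigns to r_tau(a) the mass that p assigns to a."""
    out = defaultdict(Fraction)
    for a, mass in p.items():
        out[rotate(a, tau)] += mass
    return dict(out)


def circulant_mixture(p):
    """c(p) = (1/M) * sum over tau = 0..M-1 of r_tau(p), cf. (41)."""
    m = len(next(iter(p)))
    out = defaultdict(Fraction)
    for tau in range(m):
        for a, mass in rotate_measure(p, tau).items():
            out[a] += mass / m
    return dict(out)


def is_circulant(p):
    """True if r_tau(p) == p for every rotation tau."""
    m = len(next(iter(p)))
    return all(rotate_measure(p, tau) == p for tau in range(m))


if __name__ == "__main__":
    p = {(0, 0, 1): Fraction(3, 4), (1, 1, 0): Fraction(1, 4)}
    c = circulant_mixture(p)
    print(is_circulant(p), is_circulant(c), c)
```

Exact rational arithmetic is used so that the equality test in `is_circulant` is not affected by rounding.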

Furthermore, it is easy to check that an $M$-dimensional circulant probability measure is an $M$-dimensional stationary probability measure. Now, since (40) holds for all $\tau_1 \in \{0, \ldots, nN-1\}$,

$$(1-\epsilon)\log M_1^N - \log 2 \le \frac{1}{nN}\sum_{\tau_1=0}^{nN-1} I(r_{\tau_1}(X); Z \mid c(Y)) \le I(c(X); Z \mid c(Y)) \tag{42}$$

where the second inequality follows from the concavity of mutual information. Proceeding in a similar way we obtain from (38)

$$(1-\epsilon)\log M_2^N - \log 2 \le I(c(Y); Z \mid c(X)) \tag{43}$$

and

$$(1-\epsilon)\log M_1^N M_2^N - \log 2 \le I(c(X), c(Y); Z). \tag{44}$$

If $m - 1$ consecutive components of $Z$ are discarded, then we have a channel analogous to an $nN$-block channel whose $B^{nN-m+1}$ valued output random variable is denoted by $Z^{nN}$. Then, the following upper bounds follow in a way similar to (18)

$$I(c(X); Z \mid c(Y)) \le I(c(X); Z^{nN} \mid c(Y)) + (m-1)\log|B| \tag{45a}$$

$$I(c(Y); Z \mid c(X)) \le I(c(Y); Z^{nN} \mid c(X)) + (m-1)\log|B| \tag{45b}$$

$$I(c(X), c(Y); Z) \le I(c(X), c(Y); Z^{nN}) + (m-1)\log|B|. \tag{45c}$$

Finally, it follows from (38), (42)-(45) and the stationarity of $c(X)$ and $c(Y)$ that

$$[1-\epsilon](R_1 - \delta, R_2 - \delta) - \frac{\log 2 + (m-1)\log|B|}{nN}(1, 1) \in \frac{1}{nN} Q^s_{nN}.$$

Thus if $n > [\log 2 + (m-1)\log|B|]/[\delta(1-\epsilon)]$, then

$$[1-\epsilon](R_1 - 2\delta, R_2 - 2\delta) \in \frac{1}{nN} Q^s_{nN}$$

which implies that

$$[1-\epsilon](R_1 - 2\delta, R_2 - 2\delta) \in \liminf_{n\to\infty} \frac{1}{nN} Q^s_{nN}.$$

However, since $\epsilon$ and $\delta$ are arbitrarily small, $(R_1, R_2)$ has to be a limit point of a sequence of points belonging to $\liminf_{n\to\infty} (1/nN) Q^s_{nN}$, and (37) follows.

^5 $(i)_M \in \{1, \ldots, M\}$ is equal to the remainder $i - qM$, where $q$ is an integer.


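Theorem 6: The following set is an achievable region for the $N$-ring:

$$\mathrm{closure}\Big(\bigcup_{\substack{\mu_X, \mu_Y \\ \text{stationary}}} \{(R_1, R_2): 0 \le R_1 \le I(\mu_X; \mu_Z \mid \mu_Y),\ 0 \le R_2 \le I(\mu_Y; \mu_Z \mid \mu_X),\ R_1 + R_2 \le I(\mu_X, \mu_Y; \mu_Z)\}\Big) \tag{46}$$

where

$$I(\mu_X; \mu_Z \mid \mu_Y) = \lim_{n\to\infty} \frac{1}{n} I(X^n; Z^n \mid Y^n) \tag{47a}$$

$$I(\mu_Y; \mu_Z \mid \mu_X) = \lim_{n\to\infty} \frac{1}{n} I(Y^n; Z^n \mid X^n) \tag{47b}$$

$$I(\mu_X, \mu_Y; \mu_Z) = \lim_{n\to\infty} \frac{1}{n} I(X^n, Y^n; Z^n) \tag{47c}$$

and $X^n$, $Y^n$ are the $n$-dimensional distributions induced by the stationary probability measures $\mu_X$, $\mu_Y$, and $Z^n$ is the output of the frame-synchronous multiple-access channel with memory when the inputs are independent with distributions $X^n$ and $Y^n$.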

Proof: The existence of the limits in (47) is an easy consequence of the stationarity of the inputs, the time-invariance of the channel, and the existence of entropy rate for any discrete stationary process (e.g., [9]). The symbols transmitted by each user will be denoted by

$$x = \{x_k(i),\ i = 1, \ldots, n,\ k = 1, \ldots, N\}$$

$$y = \{y_k(i),\ i = 1, \ldots, n,\ k = 1, \ldots, N\}$$

where $n$ is the codeword length and $N$ is the number of codewords in the ring. Similarly, the output symbols are labeled by

$$z = \{z_k(i),\ i = 1, \ldots, n,\ k = 1, \ldots, N\}.$$

If the users were frame-synchronous, then $z_k(i)$ would depend on $\{x_k(i-j)$ (or $x_{(k-1)_N}(i-j+n)$ if $i \le j)\}_{j=0}^{m-1}$ and $\{y_k(i-j)$ (or $y_{(k-1)_N}(i-j+n)$ if $i \le j)\}_{j=0}^{m-1}$. The lack of frame synchronism introduces a relative rotation of the rings which can be quantified by $s \in \{0, \ldots, N-1\}$, the number of codewords shifted, and $r \in \{0, \ldots, n-1\}$, the rotation modulo the codeword length. More precisely, the input $x_k(i)$ is aligned with $\tilde{y}_k(i)$, defined by

$$
\tilde{y}_k(j) = \begin{cases}
y_{(k-s)_N}(j - r), & r < j \le n \\
y_{(k-s-1)_N}(j - r + n), & 1 \le j \le r
\end{cases}
$$

or, equivalently,

$$
y_k(i) = \begin{cases}
\tilde{y}_{(s+k)_N}(r + i), & i + r \le n \\
\tilde{y}_{(s+k+1)_N}(r + i - n), & i + r > n.
\end{cases}
$$

We will now fix an integer $l \ge 0$ independent of all other parameters and force the decoder to discard the following output values:

$$z_k(i),\ i \notin I; \quad I = \{l + m, \ldots, r\} \cup \{l + m + r, \ldots, n\}$$

which corresponds to discarding the $l + m - 1$ symbols following the beginning of each received codeword. Note that if $l$ is large enough, $I = \emptyset$; however, we will eventually be interested only in the asymptotic behavior as $n \to \infty$, in which case $I$ is practically identical to $\{1, \ldots, n\}$. Note further that the relative shift $r$ is allowed to be any integer in $\{0, \ldots, n-1\}$, and so the cardinality of each of the two components of $I$ may grow linearly in $n$. We may rearrange the codewords of Fig. 2 in the matrix form shown in Fig. 3. In this figure, each codeword of user 1 occupies a single row, whereas each codeword of user 2 occupies two consecutive rows. The blacked-out outputs correspond to the $l + m - 1$ symbols following the beginning of each codeword which are discarded by the decoder. In connection with this figure, it is useful to introduce the following notation

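$$x_k^L = \{x_k(i),\ i = l+1, \ldots, r\}$$

$$x_k^R = \{x_k(i),\ i = l+r+1, \ldots, n\}$$

$$\tilde{y}_k^L = \{\tilde{y}_k(i),\ i = l+1, \ldots, r\}$$

$$\tilde{y}_k^R = \{\tilde{y}_k(i),\ i = l+r+1, \ldots, n\}$$

$$z_k^L = \{z_k(i),\ i = l+m, \ldots, r\}$$

$$z_k^R = \{z_k(i),\ i = l+r+m, \ldots, n\}$$

$$\tilde{z} = \{z_k(i),\ k = 1, \ldots, N,\ i \in I\} = \{(z_k^L, z_k^R),\ k = 1, \ldots, N\}.$$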


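Then, the definition of the multiple-access channel with finite memory (10) implies that

$$
\begin{aligned}
P_{\tilde{Z} \mid X \tilde{Y}}(\tilde{z} \mid x, \tilde{y})
&= \prod_{k=1}^{N} \prod_{i \in I} P_m\big(z_k(i) \mid x_k(i-m+1), \ldots, x_k(i), \tilde{y}_k(i-m+1), \ldots, \tilde{y}_k(i)\big) \\
&= \prod_{k=1}^{N} P_{Z_k^L \mid X_k^L \tilde{Y}_k^L}(z_k^L \mid x_k^L, \tilde{y}_k^L) \cdot P_{Z_k^R \mid X_k^R \tilde{Y}_k^R}(z_k^R \mid x_k^R, \tilde{y}_k^R)
\end{aligned}
\tag{48}
$$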

which implies that the output subblocks $\{z_k^L, z_k^R,\ k = 1, \ldots, N\}$ are conditionally independent given the inputs and only depend on their corresponding input subblocks. Notice also that the inputs with indices $i \in \{1, \ldots, l\} \cup \{r+1, \ldots, r+l\}$ do not affect any outputs used by the decoder.
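The bookkeeping behind the retained index set $I$ can be made concrete with a short sketch. It assumes 1-based positions within a codeword of user 1, a channel whose output at position $i$ depends on the inputs at positions $i-m+1, \ldots, i$, and parameter values with $l + m - 1 \le r$ and $r + l + m - 1 \le n$ so that both components of $I$ are nonempty. The function names are hypothetical.

```python
def retained_outputs(n, l, m, r):
    """Index set I = {l+m, ..., r} U {l+m+r, ..., n} of outputs kept by the decoder."""
    return list(range(l + m, r + 1)) + list(range(l + m + r, n + 1))


def inputs_used(n, l, m, r):
    """Input positions i-m+1..i that feed some retained output position i in I."""
    used = set()
    for i in retained_outputs(n, l, m, r):
        used.update(range(i - m + 1, i + 1))
    return used


if __name__ == "__main__":
    # Illustrative values: codeword length 12 and memory 2 as in Fig. 2; l and r are arbitrary.
    n, l, m, r = 12, 1, 2, 5
    kept = retained_outputs(n, l, m, r)
    unused = set(range(1, n + 1)) - inputs_used(n, l, m, r)
    print(kept, len(kept), sorted(unused))
```

Under these assumptions the number of retained outputs is $n - 2(l + m - 1)$, and the unused input positions are exactly $\{1, \ldots, l\} \cup \{r+1, \ldots, r+l\}$, in agreement with the remark above.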

We now proceed to show that for any pair $(\mu_X, \mu_Y)$ of stationary $l$-dependent measures defined on the infinite sequences drawn from $A_1$ and $A_2$, respectively, the following pentagon is achievable

$$C(\mu_X, \mu_Y) = \{(R_1, R_2): 0 \le R_1 \le I(\mu_X; \mu_Z \mid \mu_Y),\ 0 \le R_2 \le I(\mu_Y; \mu_Z \mid \mu_X),\ R_1 + R_2 \le I(\mu_X, \mu_Y; \mu_Z)\}.$$

To this end, we must show that for any fixed $(R_1, R_2) \in C(\mu_X, \mu_Y)$, $\epsilon > 0$, and $\gamma > 0$, and all sufficiently large $n$ there exist $(nN, M_1^N, M_2^N, \epsilon)$ juxtaposition codes for the $N$-ring such that

$$\frac{\log M_i}{n} \ge R_i - \gamma, \quad i = 1, 2,$$

i.e., there exist $N$ $(n, M_i)$ code books for user $i$ (independent of the relative rotation), and a decoding rule (possibly dependent on the relative rotation) such that the probability that any of the $N$ messages transmitted by each user is decoded incorrectly is not higher than $\epsilon$. The codes are chosen as follows.

Random Coding: The $N$ code books of user $i$ are denoted by $\{F_{ik}: \{1, \ldots, M_i\} \to A_i^n\}_{k=1}^{N}$ and are the outcomes of random selection where each codeword in each code book is independently selected with probability

$$P_{F_{1k}(m)}(a_1, \ldots, a_n) = P_{X^n}(a_1, \ldots, a_n), \quad m \in \{1, \ldots, M_1\}$$

$$P_{F_{2k}(m)}(b_1, \ldots, b_n) = P_{Y^n}(b_1, \ldots, b_n), \quad m \in \{1, \ldots, M_2\}$$

where $X^n$ and $Y^n$ are the independent $n$-dimensional distributions induced by $(\mu_X, \mu_Y)$. The overall code book of user $i$ resulting from the juxtaposition of the foregoing $N$ code books is denoted by $F_i: \{1, \ldots, M_i\}^N \to A_i^{nN}$.

Decoding: The decoder performs simultaneous decoding of the $N$ messages transmitted by both users, upon observing $\tilde{z}$ and the rotation $(r, s)$, in the knowledge of the code books $f_1$ and $f_2$. The decoder selects the messages $(m_1, m_2) \in \{1, \ldots, M_1\}^N \times \{1, \ldots, M_2\}^N$ if $(m_1, m_2)$ is the unique pair that satisfies

..., N}) are independent. Then we obtain via (48) that

h(x,y,z)-  E  »o 8  r~u'-L\ — 

*-i  Ptt\t£\zk\yk  / 


,  P2!\xt9t{*k\Xk'9k) 


(52a) 


( fiM,  f2(m2),z)  e  j(;n,S) 


(49) 


■  /  -  £  .  Ptt\xi9tL{^k\xt •  Pk  ) 

E  log—- - 

k  -1 


where $0 < \delta < \gamma$, and it outputs a decoding error if there are zero or more than one such pairs. The set $J(n, \delta)$ in (49) is the set of jointly typical sequences according to three criteria:


Mn’S)  “  {(*.  $’*)• 


W,(x,j!,:)-i/(X;Z|K) 
n  n 


Z  8 


Ptt\xl-(zt\xt) 

Pt!\xtff{*k  1**  »  Pk) 
Ptt\x[(2*\xk) 

Pti\xtrt{xk\x^yt) 

h(x,y,z)-  E  tog - —fJT\ - 

PthxfYf  ( z*\xk  *  y* ) 

PiiW)  • 


(52b) 


(52c) 


Jl(n>S)  “  {(*.  .M")*- 


Mn’*)  *  ((*•  y^b 


-M{x,y,z)--l{Y-J\X) 
tt  n 


— /3(jc,  y,z)-  -f(X,  Y;Z) 
n  n 


*8 

(50b) 

*6 


\  where  Z" 


where 


Taking  expected  values  of  (52)  with  respect  to  (X,  Y,  Z) 
(50a)  jjjjJ  rccaiung  that  the  inputs  are  stationary  we  obtain  that 

l{X\Z\Y)-Nl{X*'/r\Y*)  (53a) 

/(y;Z|X)-W(r;Z"|^»)  (53b) 

I(X,Y',Z)-NI(X\Y*\Z*)  (53c) 

and  Y*  denote  {Z,(/)f  iel)  and  { ?,(/),  i  =■ 
i),  respectively.  Furthermore,  since  {Z,(i),  is/} 
(50c)  depends  on  the  inputs  only  through  X{-,  Xf,  ?{-,  and  ?*, 
and  since  /iy  is  stationary  and  /-dependent,  { YXL,  f,*}  has 
the  same  distribution  as  { Y{,  Y* }  and  therefore  (53)  can 
be  written  as 

I(X-,  Z\Y  )»///(  X”\  Z”\Y”)  (54a) 

l{Y\Z\X)-Nl{Y\Zn\Xn)  (54b) 

I(X,Y’,Z)-NI{X\Y”’,Z”).  (54c) 

The probability that the transmitted messages $(S_1, S_2) = (m_1, m_2)$ are not decoded correctly given that $(f_1, f_2)$ are the chosen code books is

„  t  t  .  ,  .  .  „„  A) -P[(fi(«.).P.(«.).z) «•'(»•*)« 

Note  that  the  expected  values  of  the  functions  in  (51) 


*i(*.  ^.f)-log 
h(x,  ytz) m  log 
h(x,  y,f)-log 


PiixAUx,  y) 

Pi\?(i\y) 

Piixt&X’jf) 

Pi\xO\x) 

Pi\xt(i\x,  y) 

Pt(i) 


(51a) 

(51b)* 

(51c) 


evaluated  with  the  distribution  (X,  Y,2)  are  equal  to  the 
mutual  informations  appearing  in  (50).  We  can  decompose 
the  functions  in  (51)  taking  advantage  of  the  assumed 
/•dependence  of  the  inputs,  which  implies  that_the_  random 
variables  { X£,  X*,  k-\,---,N)  (and  { ?£,  Yf,  k- 


3(m{,  m2)  *  (mlt  /w^)  such  that 

Wm[),F2{m'2),Z)eJ(n,8)\{Fx,F2) 
*(/i*  ^)* ( *^i»  ^t)  “  (wti,  m2)j . 

Averaging over the random selection of code books and invoking the union bound, we obtain

$$
\begin{aligned}
E[e_{m_1 m_2}(F_1, F_2)]
&\le P[(F_1(m_1), F_2(m_2), \tilde{Z}) \notin J(n, \delta) \mid (S_1, S_2) = (m_1, m_2)] \\
&\quad + \sum_{m_1' \ne m_1} \sum_{m_2' \ne m_2} P[(F_1(m_1'), F_2(m_2'), \tilde{Z}) \in J_3(n, \delta) \mid (S_1, S_2) = (m_1, m_2)] \\
&\quad + \sum_{m_1' \ne m_1} P[(F_1(m_1'), F_2(m_2), \tilde{Z}) \in J_1(n, \delta) \mid (S_1, S_2) = (m_1, m_2)] \\
&\quad + \sum_{m_2' \ne m_2} P[(F_1(m_1), F_2(m_2'), \tilde{Z}) \in J_2(n, \delta) \mid (S_1, S_2) = (m_1, m_2)].
\end{aligned}
\tag{55}
$$

The first term in the right side of (55) is smaller than $\epsilon/2$ for sufficiently large $n$ because

$$\lim_{n\to\infty} P[(F_1(m_1), F_2(m_2), \tilde{Z}) \in J_k(n, \delta) \mid (S_1, S_2) = (m_1, m_2)] = 1, \quad k = 1, 2, 3. \tag{56}$$

This holds because the inputs and output are jointly stationary and ergodic (4) (the output is $(l + m + 1)$-dependent) and therefore the Shannon-McMillan theorem (see e.g., [9]) implies that each of the $2N$ terms in the right sides of (52) converges in probability (when scaled by $n$) to its expected value (which may be zero if, for example, $r$ remains finite as $n \to \infty$). (Notice that this holds even though each output subblock $z_k^L$ (or $z_k^R$) has $m - 1$ fewer elements than the corresponding input subblocks $x_k^L$ and $\tilde{y}_k^L$ (or $x_k^R$ and $\tilde{y}_k^R$), since convergence is not affected by any fixed number of elements.)

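To investigate the behavior of the second term on the right side of (55), we will introduce the independent random vectors $U$ and $V$ defined on $A_1^{nN}$ and $A_2^{nN}$, respectively, whose distributions are $P_X$ and $P_{\tilde{Y}}$, but which, unlike $X$ and $\tilde{Y}$, are independent of $\tilde{Z}$, i.e.,

$$P_{UV\tilde{Z}}(x, y, \tilde{z}) = P_X(x) P_{\tilde{Y}}(y) P_{\tilde{Z}}(\tilde{z}) = P_{X\tilde{Y}\tilde{Z}}(x, y, \tilde{z}) \exp(-i_3(x, y, \tilde{z})). \tag{57}$$

If $(x, y, \tilde{z}) \in J_3(n, \delta)$, however, then (50c) implies that

$$\exp(-i_3(x, y, \tilde{z})) \le \exp(-I(X, \tilde{Y}; \tilde{Z}) + n\delta)$$

and, consequently,

$$
\begin{aligned}
P[(F_1(m_1'), F_2(m_2'), \tilde{Z}) \in J_3(n, \delta) \mid (S_1, S_2) = (m_1, m_2)]
&= \sum_{(x, y, \tilde{z}) \in J_3(n, \delta)} P_{UV\tilde{Z}}(x, y, \tilde{z}) \\
&\le \exp(-I(X, \tilde{Y}; \tilde{Z}) + n\delta).
\end{aligned}
\tag{58}
$$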

Proceeding similarly with the third and fourth terms on the right side of (55) we obtain that, for sufficiently large $n$,

$$
\begin{aligned}
E[e_{m_1 m_2}(F_1, F_2)] &\le M_1^N \exp(-I(X; \tilde{Z} \mid \tilde{Y}) + n\delta) + M_2^N \exp(-I(\tilde{Y}; \tilde{Z} \mid X) + n\delta) \\
&\quad + M_1^N M_2^N \exp(-I(X, \tilde{Y}; \tilde{Z}) + n\delta) + \frac{\epsilon}{2}.
\end{aligned}
\tag{59}
$$

Thus if $M_1$ and $M_2$ grow sufficiently slowly with $n$ we will be able to show that for large $n$ the right side of (59) does not exceed $\epsilon$. Specifically, we choose $M_1$ and $M_2$ to satisfy

lOg  Af; 
n 


R,- 


8  +  y 

~W’ 


i- 1,2.  (60) 


(This choice is possible for all sufficiently large $n$.) Then,


(59)  is  further  upper-bounded  by 


^exp^nN 
+  exp 
+exp(  nN 


r-«\ 

~-T 


1 


(^[^“/(Ml*) 


\ 


y-S 


—  n- 


\ 


1 


Rl+R1-—I(X,Y-Z) 


) 

(61) 


Using (54) and recalling that $\delta < \gamma$, it is seen that if

$$0 \le R_1 \le \liminf_{n\to\infty} \frac{1}{n} I(X^n; \tilde{Z}^n \mid Y^n) \tag{62a}$$

$$0 \le R_2 \le \liminf_{n\to\infty} \frac{1}{n} I(Y^n; \tilde{Z}^n \mid X^n) \tag{62b}$$

$$R_1 + R_2 \le \liminf_{n\to\infty} \frac{1}{n} I(X^n, Y^n; \tilde{Z}^n), \tag{62c}$$

then the right side of (61) does not exceed $\epsilon$ for sufficiently large $n$. Therefore, at least one realization of the code books must exist that results in probability of error better than $\epsilon$, and so the pentagon in (62) is achievable. Actually, that region coincides with $C(\mu_X, \mu_Y)$ because $\tilde{Z}^n$ was obtained by discarding $2(l + m - 1)$ elements from $Z^n = \{Z_1(i),\ i = 1, \ldots, n\}$ (which is the output of the frame-synchronous channel with inputs $X^n$ and $Y^n$) and therefore (cf. (18))

$$I(X^n; \tilde{Z}^n \mid Y^n) \le I(X^n; Z^n \mid Y^n) \le I(X^n; \tilde{Z}^n \mid Y^n) + 2(l + m - 1)\log|B| \tag{63a}$$

$$I(Y^n; \tilde{Z}^n \mid X^n) \le I(Y^n; Z^n \mid X^n) \le I(Y^n; \tilde{Z}^n \mid X^n) + 2(l + m - 1)\log|B| \tag{63b}$$

$$I(X^n, Y^n; \tilde{Z}^n) \le I(X^n, Y^n; Z^n) \le I(X^n, Y^n; \tilde{Z}^n) + 2(l + m - 1)\log|B|. \tag{63c}$$

Hence we may replace $\tilde{Z}^n$ by $Z^n$ in (62), obtaining the limits of (47). Thus we have shown that the region


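$$\mathrm{closure}\Big(\bigcup_{l \ge 0}\ \bigcup_{\substack{\mu_X, \mu_Y \\ \text{stationary } l\text{-dependent}}} \{(R_1, R_2): 0 \le R_1 \le I(\mu_X; \mu_Z \mid \mu_Y),\ 0 \le R_2 \le I(\mu_Y; \mu_Z \mid \mu_X),\ R_1 + R_2 \le I(\mu_X, \mu_Y; \mu_Z)\}\Big) \tag{64}$$

is achievable.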

It  remains  to  show  that  the  restriction  to  /-dependent 
input  distributions  can  be  dropped  without  changing  the 
region  in  (64).  We  will  do  so  in  three  steps  where  we  show 
that  the  union  can  be  written  over  1)  5-processes,  2) 
ergodic  stationary  processes,  and  finally,  3)  stationary  pro¬ 
cesses. 
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Step  1:  The  5-proce$ses  are  an  important  class  of  sta¬ 
tionary  ergodic  discrete-time  random  processes  (intro¬ 
duced.  by,  Omstein  [16])  that  can  be  defined  as  the  outputs 
of: the  time-invariant  systems  driven  by  independent  iden¬ 
tically  distributed  (i.i.d.)  inputs.  This  is,  in  effect,  a  mixing 
condition  requiring  that  the  influence  of  the  sufficiently 
distant -past  becomes  negligible. _It  was  shown  by  Omstein 
[16]  that  the  closure  in  the  ^metric  of  the  stationary 
/•dependent  processes  is  equal  to  > the  set  of  5-processes. 
The  J-metric  between  two  stationary  ergodic  measures  p 
and  p  is  equal  to  the  minimum  percentage  of  time  samples 
we  need  to  change  a  representative  realization  of  p  to 
make  it  look  like  a  representative  realization  of  p.  Due  to 
the  finite  memory  of  the  channel,  it  is  easy  to  show  that.if 
a  sequence  of  stationary  ergodic  input  measures  M^.M^* 
converges  in  the  J-metric  to  px  and  pr,  then  the  corre¬ 
sponding  output  measures  also  converge  in  the.  J-metric, 
because  one  way  to  generate  a  representative  sequence  of 
pz  is  by  modifying  representative  strings  of  p(x*  and  p^ 
to  get  representative  strings  of  px  and  pY  without  chang¬ 
ing  the  output  samples  unaffected  by  those  modifications. 
Therefore,.  d(pl£\ pz)  <»  m[ d(plx\ M*)+  d{p^\  fiy)]  since 
each  input  value  affects  at  most  m  output  values.  Now, 
since  the  entropy  rate  is  a  continuous  function  of  the 
stationary  measure  under  the  J-metric  [19]  and  the  three 
constraints  in  (64)  can  be  written  as 

/(mx'.MzIMy)  -  H(pY,Pz)  +  H(px)-ff(px,pY,pz) 

(65a) 

^MyiMzlM*)  -  H{px,pz)+  H(py)- H(px, pY,pz) 

(65b) 

HP; *.My*.Mz)  ■  H(px)+ H{py)+ H{pz) 

-H{px,pY,pz),  (65c) 


oh  the  n  +  m  —  1-dimensional  distributions  ot  pty  and 
m(t  *•  Furthermore,  since  the  entropy  rate  is  a  lower  semi- 
continuous  function  in  the  weak  topology,  we  can  write 

liminf//(/i(2>)  *  # (Mz) 

k—  co 

and  (cf.  [17]) 

liminf H(piz)\p<Y})  «  H(pz\py), 

k  —oo 

and 

lim  tf(fizV*>,M</>)  -  H{pz\px,pY), 

k  —  oo 

since  the  latter  expression  is  linear  in  the  conditioning 
measures.  This  implies  that  the  union  appearing  in  the 
achievable  region  can  indeed  be  extended  to  the  stationary 
ergodic  measures. 

Step 3: This step has a well-known counterpart in the
solution of the capacity of single-user channels with memory
(cf. [11, sec. III]). There, the ergodic assumption is
needed to invoke the Shannon-McMillan theorem in the
proof of the direct theorem, whereas the usual converse
techniques upper-bound capacity by the maximum of mutual
information rates over all stationary inputs. A proof
that the lower and upper bounds thus obtained coincide
was given by Parthasarathy [17] using the ergodic decomposition
theorem. Even though in the multiuser case capacity
is not given as the maximization of a scalar function,
we can use Parthasarathy's result by noticing that all we
need to show is that for every $0 \le \alpha \le 1$ (cf. [21])

$$\sup_{\substack{\mu_X \times \mu_Y \\ \text{stationary}}} G_\alpha(\mu_X, \mu_Y) = \sup_{\substack{\mu_X \times \mu_Y \\ \text{stationary ergodic}}} G_\alpha(\mu_X, \mu_Y) \tag{66}$$

where $G_\alpha(\mu_X, \mu_Y)$ is the maximum of $\alpha R_1 + (1-\alpha) R_2$ over the rate pairs
$(R_1, R_2)$ in the feasible pentagon determined by $(\mu_X, \mu_Y)$, that is,

$$G_\alpha(\mu_X, \mu_Y) = \begin{cases} (2\alpha - 1)\, I(\mu_X; \mu_Z \mid \mu_Y) + (1-\alpha)\, I(\mu_X, \mu_Y; \mu_Z), & 1/2 \le \alpha \le 1 \\ (1 - 2\alpha)\, I(\mu_Y; \mu_Z \mid \mu_X) + \alpha\, I(\mu_X, \mu_Y; \mu_Z), & 0 \le \alpha \le 1/2 \end{cases} \tag{67}$$

the region (64) is unchanged if we enlarge the set of
stationary $l$-dependent processes to its closure, the set of
B-processes.

Step 2: The stationary mixing multistep Markov processes
are B-processes (the mixing condition essentially
rules out processes with periodicities), whose closure in the
weak topology is the set of stationary ergodic processes
[10, p. 360], where we say that $\mu^{(k)}$ converges weakly to $\mu$
if, for all $n > 0$, the $n$-dimensional distribution induced by
$\mu^{(k)}$ converges to that of $\mu$. Again, due to the finite
memory of the channel it is easy to show that if $\mu_X^{(k)}$, $\mu_Y^{(k)}$
converge weakly to $\mu_X$ and $\mu_Y$, then the corresponding
output measures also converge weakly, because the $n$-dimensional
distribution induced by $\mu_Z^{(k)}$ depends linearly


where the second equality holds because the maximization
on the left side is attained at one of the two (Pareto)
optimum vertices of the feasible pentagon (note that
$I(\mu_X, \mu_Y; \mu_Z) \le I(\mu_X; \mu_Z \mid \mu_Y) + I(\mu_Y; \mu_Z \mid \mu_X)$). We may
now fix $\alpha \in [0, 1/2]$, the other case being entirely parallel.
We then obtain that for all stationary pairs $\mu_X$, $\mu_Y$

$$\begin{aligned} G_\alpha(\mu_X, \mu_Y) &= (1-\alpha)\, I(\mu_X, \mu_Y; \mu_Z) - (1-2\alpha)\, I(\mu_X; \mu_Z) \\ &= (1-\alpha) \iint I(\mu_{X_x}, \mu_{Y_y}; \mu_Z)\, dP_X(x)\, dP_Y(y) - (1-2\alpha) \int I(\mu_{X_x}; \mu_Z)\, dP_X(x) \\ &= \iint G_\alpha(\mu_{X_x}, \mu_{Y_y})\, dP_X(x)\, dP_Y(y) \end{aligned} \tag{68}$$

where the second equation follows from Parthasarathy's
representation theorem [17], and $\{\mu_{X_x}, x \in \Lambda_1\}$ and
$\{\mu_{Y_y}, y \in \Lambda_2\}$ are the stationary ergodic measures in the ergodic
decompositions of $\mu_X$ and $\mu_Y$:

$$\mu_X(E) = \int_{\Lambda_1} \mu_{X_x}(E)\, dP_X(x) \tag{69a}$$

$$\mu_Y(F) = \int_{\Lambda_2} \mu_{Y_y}(F)\, dP_Y(y) \tag{69b}$$

for all measurable sets $E$ and $F$. Notice that the only
restriction of Parthasarathy's result is that the channel
connecting input and output be stationary, and this is the
case for the channel that connects $(\mu_X, \mu_Y)$ with $\mu_Z$, as
well as the channel seen by the first user, which connects
$\mu_X$ and $\mu_Z$, because both $\mu_Y$ and the multiple-access
channel are stationary.

Finally, for a stationary pair $\mu_X$, $\mu_Y$ to achieve a value of
$G_\alpha(\mu_X, \mu_Y)$ close to the supremum, there must exist $(x, y)$
such that $G_\alpha(\mu_{X_x}, \mu_{Y_y})$ is close to the supremum,
because the average with respect to $P_X \times P_Y$ in (68)
cannot be larger than each of its sample values. This fact
shows (66) and completes the proof of the theorem.
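
The vertex argument behind (67) can be checked numerically. The following is a minimal sketch, not part of the original proof: it maximizes $\alpha R_1 + (1-\alpha) R_2$ over a pentagon by grid search and compares the result with the closed form. The function names and the three information values used in the check are hypothetical illustrations.

```python
def g_alpha_closed_form(alpha, i1, i2, i12):
    """Closed form (67) with i1 = I(X;Z|Y), i2 = I(Y;Z|X), i12 = I(X,Y;Z)."""
    if alpha >= 0.5:
        return (2 * alpha - 1) * i1 + (1 - alpha) * i12
    return (1 - 2 * alpha) * i2 + alpha * i12


def g_alpha_brute_force(alpha, i1, i2, i12, steps=400):
    """Grid search of alpha*R1 + (1-alpha)*R2 over the pentagon
    0 <= R1 <= i1, 0 <= R2 <= i2, R1 + R2 <= i12."""
    best = 0.0
    for a in range(steps + 1):
        r1 = i1 * a / steps
        # Largest R2 allowed by the individual and the sum constraints.
        r2 = min(i2, i12 - r1)
        if r2 < 0:
            continue
        best = max(best, alpha * r1 + (1 - alpha) * r2)
    return best
```

For a valid pentagon (each individual bound at most the sum bound, and the sum bound at most their total) the two functions agree up to the grid resolution for every weight between 0 and 1.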

We  will  now  use  Theorems  5  and  6  to  find  the  frame- 
asynchronous  capacity  region  of  the  examples  we  studied 
at  the  end  of  Section  II. 

Example 1: Since the frame-synchronous capacity region
is achieved by stationary (i.i.d.) inputs, it remains the
same  if  the  users  are  frame-asynchronous.  The  same  is  true 
for  the  symbol-asynchronous  Gaussian  multiple-access 
channel  [21]  where  the  capacity  region  is  achieved  by 
stationary  colored  Gaussian  processes.  (In  that  case,  the 
capacity  region  does  depend,  in  general,  on  whether  the 
transmitters  are  symbol-synchronous.) 

Example 2: We will show first that the triangle $\{0 \le R_1,
0 \le R_2, R_1 + R_2 \le 0.5 \text{ bit}\}$ (Fig. 1) is an outer bound to
the frame-asynchronous capacity region. In the second part
of the proof we will show that it is achievable. From the
definition of this channel (33), it is easy to compute the
conditional entropy (in bits):

$$H(Z_i \mid X_{i-1}, X_i, Y_{i-1}, Y_i) = 1 - \beta_1 \gamma_2 - \beta_2 \gamma_1 \tag{70}$$

with

$$\gamma_1 = P[X(i) \ne 0], \qquad \beta_1 = P[X(i) = 0,\ X(i-1) \ne 0]$$

$$\gamma_2 = P[Y(i) \ne 0], \qquad \beta_2 = P[Y(i) = 0,\ Y(i-1) \ne 0]$$

where the foregoing probabilities are independent of the
time $i$ because $X^n$ and $Y^n$ are stationary. Since the outputs
are independent conditioned on the inputs, we have

$$\begin{aligned} I(X^n, Y^n; Z_m^n) &\le \sum_{i=m}^{n} I(X_{i-1}, X_i, Y_{i-1}, Y_i; Z_i) \\ &\le \sum_{i=m}^{n} \{1 - H(Z_i \mid X_{i-1}, X_i, Y_{i-1}, Y_i)\} \\ &= (n - m + 1)[\gamma_1 \beta_2 + \gamma_2 \beta_1], \end{aligned} \tag{71}$$

and since $\gamma_k + \beta_k \in [0, 1]$ and $\gamma_k - \beta_k \in [0, 1]$, we have
$\beta_k \in [0, 1/2]$ and

$$\gamma_1 \beta_2 + \gamma_2 \beta_1 \le 1/2$$

which, together with (71), implies that the total capacity of
the frame-asynchronous channel is bounded from above
by

$$\lim_{n\to\infty} \frac{1}{n} \max_{\substack{X^n, Y^n \\ \text{stationary}}} I(X^n, Y^n; Z_m^n) \le 1/2,$$

in contrast to the total capacity of the frame-synchronous
channel, which is equal to 1 bit.
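
The inequality $\gamma_1\beta_2 + \gamma_2\beta_1 \le 1/2$ can be confirmed by a small grid search. This is an illustrative sketch only: the function name and the grid size are assumptions, and it uses the fact that the objective increases with each $\beta_k$, so the largest value allowed by $0 \le \beta_k \le \gamma_k$ and $\gamma_k + \beta_k \le 1$ is taken.

```python
def max_sum_rate_bound(steps=50):
    """Grid search of gamma1*beta2 + gamma2*beta1 subject to
    0 <= beta_k <= gamma_k and gamma_k + beta_k <= 1 for k = 1, 2."""
    best = 0.0
    grid = [j / steps for j in range(steps + 1)]
    for g1 in grid:
        b1 = min(g1, 1 - g1)  # largest feasible beta_1 for this gamma_1
        for g2 in grid:
            b2 = min(g2, 1 - g2)  # largest feasible beta_2 for this gamma_2
            best = max(best, g1 * b2 + g2 * b1)
    return best
```

The search returns one half (up to rounding), attained for instance at $\gamma_1 = \gamma_2 = 1/2$.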

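To show achievability of the triangle, we choose both
input processes to be stationary and Markov with

$$P[X(i) = 0 \mid X(i-1) = 0] = P[Y(i) = 0 \mid Y(i-1) = 0] = 0$$

(i.e., $\gamma_k = 1 - \beta_k$, $k = 1, 2$). Then, as there are no consecutive
input zeros, the channel is equivalent to the memoryless
channel

$$z_i = \begin{cases} x_i, & \text{if } x_i \ne 0 \text{ and } y_i = 0 \\ y_i, & \text{if } y_i \ne 0 \text{ and } x_i = 0 \\ (1/2, 1/2), & \text{otherwise.} \end{cases} \tag{72}$$

Furthermore, we will only consider inputs whose nonzero
values are independent and equally likely to be 1 or 2.
Then the outputs are independent both unconditionally
and conditioned on either input sequence:

$$H(Z_m^n) = n - m + 1 \tag{73}$$

$$H(Z_m^n \mid Y^n) = \sum_{i=m}^{n} H(Z_i \mid Y^n) = \sum_{i=m}^{n} H(Z_i \mid Y_i) = (n - m + 1)\left[1 - \gamma_2 + \gamma_2\, h(\gamma_1/2)\right] \tag{74}$$

where the second equation follows from (72) and the
independence of $X^n$ and $Y^n$. Then (70), (73), and (74)
imply that

$$I(X^n; Z_m^n \mid Y^n) = (n - m + 1)\left[\gamma_1(1 - \gamma_2) + \gamma_2\left(h(\gamma_1/2) - \gamma_1\right)\right]$$

$$I(Y^n; Z_m^n \mid X^n) = (n - m + 1)\left[\gamma_2(1 - \gamma_1) + \gamma_1\left(h(\gamma_2/2) - \gamma_2\right)\right]$$

$$I(X^n, Y^n; Z_m^n) = (n - m + 1)\left[\gamma_1(1 - \gamma_2) + \gamma_2(1 - \gamma_1)\right].$$

However, $h(\gamma_k/2) \ge \gamma_k$, and thus the following region is
achievable:

$$\bigcup_{\substack{\gamma_k \in [1/2, 1] \\ k = 1, 2}} \{(R_1, R_2):\ 0 \le R_1 \le \gamma_1(1 - \gamma_2),\ 0 \le R_2 \le \gamma_2(1 - \gamma_1)\} \tag{75}$$

which can be shown to coincide with the triangle $\{(R_1, R_2):
0 \le R_1,\ 0 \le R_2,\ R_1 + R_2 \le 1/2\}$.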
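
The claim that the union (75) fills the triangle can be probed numerically. The sketch below is an illustration under stated assumptions (hypothetical function name, finite grid over $\gamma_1, \gamma_2 \in [1/2, 1]$, small tolerance for rounding); it tests whether a given rate pair belongs to one of the rectangles in the union.

```python
def in_region_75(r1, r2, steps=200, tol=1e-12):
    """True if (r1, r2) lies in the union (75), searched over a grid of
    gamma_1 and gamma_2 values in [1/2, 1]."""
    if r1 < 0 or r2 < 0:
        return False
    for a in range(steps + 1):
        g1 = 0.5 + 0.5 * a / steps
        for b in range(steps + 1):
            g2 = 0.5 + 0.5 * b / steps
            if r1 <= g1 * (1 - g2) + tol and r2 <= g2 * (1 - g1) + tol:
                return True
    return False
```

Pairs with $R_1 + R_2 < 1/2$ are found inside the union, while pairs with $R_1 + R_2 > 1/2$ are not, since $\gamma_1(1-\gamma_2) + \gamma_2(1-\gamma_1)$ never exceeds $1/2$ on this range.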

It can be seen that in this example full frame-synchronous
capacity would be achieved if the encoders were
informed of the relative shift modulo 2, and without this
information, they cannot do better than the frame-asynchronous
region. This points out that, in contrast to
the memoryless channel, even a mild form of asynchronism
where the shift may be only 0 or 1 reduces the
capacity region. The reason is that in the memoryless
channel, a large time-scale type of cooperation (time-sharing)
is enough to achieve capacity, whereas in a channel
with memory, the encoders may need to cooperate in a
small time-scale. Mild frame-asynchronism only precludes
cooperation in the small time-scale.

Abstract: In the information theory of the multiple-access channel, two
types of synchronism are usually assumed among the transmitters, namely,
frame and symbol synchronism. Frame synchronism refers to the ability of
the users to start the transmission of their codewords in unison. The issue
of symbol synchronism arises in continuous-time channels in which each
codeword symbol modulates a fixed assigned waveform; the channel is
symbol synchronous if the users cooperate so that their symbol epochs
coincide at the receiver. In practice symbol synchronism is harder to
achieve, yet the only reported progress so far has been in the removal of
the assumption of frame synchronism. It is shown that if the transmitters
are assigned the same waveform, symbol asynchronism has no effect on
the two-user capacity region of the white Gaussian channel, which is equal
to the Cover-Wyner pentagon, whereas if the assigned waveforms are
different (e.g., code division multiple access), the symbol-asynchronous
capacity region is no longer a pentagon.

I. Introduction

THE  MAIN  GOAL  of  the  information-theoretic  study 
of  the  multiple-access  channel  is  to  find  its  capacity 
region,  i.e.,  the  set  of  information  rates  at  which  simultane¬ 
ous  reliable  communication  of  the  messages  of  each  user  is 
possible.  This  problem  was  solved  in  the  pioneering  work 
of Ahlswede [1], [2] on the two-user discrete memoryless
channel; later, an explicit expression for the capacity region
of the Gaussian memoryless discrete-time multiple-access
channel was given by Cover [3] and Wyner [4].
These and most of the subsequent results on the subject
assumed so-called frame (or block) synchronism among
the users in the sense that the beginnings of the codewords
of each user were guaranteed to coincide at the receiver. It
has been shown by Poltyrev [5] and, independently, by Hui
and Humblet [6] that the only effect of frame asynchronism
on the discrete memoryless multiple-access channel is
the removal of the convex hull operation from the expression
for the capacity region. It was recently shown [7] that
if the multiple-access channel has memory, frame asynchronism
may drastically reduce the capacity region and,
in particular, the maximum achievable rate sum. At any
rate, in many practical situations it is perfectly reasonable
to assume that this type of synchronism is achievable with
a modicum of channel feedback or cooperation among
transmitters.

Fig. 1. (a) Frame-asynchronous symbol-synchronous two-user channel.
(b) Frame-synchronous symbol-asynchronous two-user channel.

The type of synchronism that is difficult to achieve in
many practical situations (owing to the much smaller time
scale involved) is symbol synchronism. This issue arises in
continuous-time channels where each codeword symbol
modulates a signal waveform of finite duration, as is the
case in most conventional digital communication systems.
In these systems, user $k$ transmits a codeword
$(b_k(1), \ldots, b_k(n)) \in A_k^n$ by sending the signal

$$\sum_{i=1}^{n} s_k(t - iT;\, b_k(i))$$

where the waveforms $\{s_k(t; b),\ b \in A_k\}$ vanish outside the
interval $[0, T]$ and constitute the fixed signaling alphabet
of user $k$, which is known to all transmitters and to the
receiver. If the symbol epochs of the signals transmitted by
the users are not aligned at the receiver, then the channel is
symbol asynchronous (Fig. 1). For a channel with two
senders and one receiver, assuming frame synchronism and
an additive white Gaussian noise channel model, we can
write the channel output as

$$y(t) = \sum_{i=1}^{n} s_1(t - iT - \tau_1;\, b_1(i)) + \sum_{i=1}^{n} s_2(t - iT - \tau_2;\, b_2(i)) + n(t) \tag{1.1}$$

where the delays or offsets $\tau_1 \in [0, T)$, $\tau_2 \in [0, T)$ account
for the symbol asynchronism between the users and are
known to the receiver (because it acquires the timing of
each of the received signals to decode reliably each of the
transmitted messages) and unknown to the transmitters.

While the derivation of coding strategies for symbol-asynchronous
channels has been addressed before [8], it
appears that no results on the capacity region of the
multiple-access channel are available when symbol synchronism
is not assumed. In this paper we find the capacity
region of a fairly general symbol-asynchronous Gaussian
multiple-access channel in which user $k$ modulates
linearly a fixed signature waveform $s_k(t)$, i.e., $s_k(t; b) =
b\, s_k(t)$. This encompasses many interesting channels in
applications, such as direct-sequence spread-spectrum
code-division multiple-access channels (CDMA) wherein
each transmitter is assigned a distinct signature waveform
which is used to modulate information simultaneously and
independently of the other transmitters.¹ We focus our
attention on energy-limited channels where $A_k = \mathbb{R}$, and
each codeword of user $k$ is constrained to satisfy

$$\frac{1}{n} \sum_{i=1}^{n} b_k^2(i) \le w_k, \qquad k = 1, 2. \tag{1.2}$$

The methods employed in this paper can be used to solve
the case where the $A_k$ are finite alphabets; however, in this
case, as in the single-user discrete-time Gaussian channel
with finite alphabets or amplitude constraints, no explicit
expressions for capacity can be obtained.

If the transmitters are assigned identical signature waveforms
and are symbol synchronous, i.e., $\tau_1 = \tau_2$, then it is
easy to see that the channel is equivalent to the standard
one-dimensional discrete-time Gaussian multiple-access
channel, and therefore, its capacity region is given by the
Cover-Wyner pentagon: each individual rate is constrained
not to exceed single-user capacity and the sum of
the rates cannot exceed the capacity of a single-user channel
whose signal-to-noise ratio is the sum of the signal-to-noise
ratios of both users. In this paper it is shown that the
same result holds even if the users are not symbol synchronous.
However, that is no longer true when the transmitters
are assigned different signature waveforms. Then
the symbol-asynchronous capacity region is no longer a
pentagon and depends not only on the respective signal-to-noise
ratios, but also on the similarity between the
signature waveforms quantified by their cross correlations.
In some applications it may be of interest to use the
capacity region found in this paper for any arbitrary
choice of signature waveforms as a basis for optimum
signal design (i.e., to find the elements that achieve the
boundary of the union of capacity regions over a certain
set of signature waveforms) under a variety of specific
constraints on the set of feasible signals, such as direct-sequence
waveforms with a maximum number of chips-per-symbol
or signals approximately bandlimited to a
specified bandwidth. However, it is worth noting that in
many practical applications the choice of signature waveforms
is dictated by considerations such as jamming resistance
and the use of specific waveforms selected from
families of pseudonoise sequences with favorable cross-correlation
properties (such as Gold sequences or maximal-length
shift-register sequences).

¹Most capacity analyses of the CDMA channel have focused on
single-user receivers and approximated the multiple-access interference
by a white Gaussian process [9]-[11], thereby providing limited insight
into the fundamental limits of that channel.

The  first  step  in  the  derivation  of  the  capacity  of  the 
symbol-asynchronous  Gaussian  channel  is  to  obtain  an 
equivalent  channel  model  with  discrete-time  outputs.  This 
is  the  purpose  of  Section  II,  where  an  equivalent  discrete¬ 
time  Gaussian  channel  parametrized  by  the  signal  cross 
correlations  is  derived.  The  main  feature  introduced  by  the 
lack  of  symbol  synchronism  is  that  the  channel  has  mem¬ 
ory.  This  is  due  to  the  overlap  of  each  symbol  transmitted 
by  a  user  with  two  consecutive  symbols  transmitted  by  the 
other  user.  The  capacity  of  discrete-time  multiple-access 
channels  with  finite  memory  was  obtained  in  [7]  with  and 
without  frame  synchronization.  Those  results  are  used  in 
Section  III  to  obtain  the  capacity  region  of  the  symbol- 
asynchronous  Gaussian  multiple-access  channel,  which 
turns  out  to  be  independent  of  whether  or  not  the  channel 
is frame synchronous. Since the relative offset $\tau_2 - \tau_1$ between
the received signals is not known to the transmitters,
we  must  deal  with  a  compound  multiple-access  channel 
where  the  encoders  only  know  that  the  actual  channel 
belongs  to  an  uncertainty  set  parametrized  by  the  relative 
offset.  For  the  sake  of  clarity  of  exposition  we  deal  first 
with  the  case  where  the  relative  offset  is  known  to  all 
parties  (i.e.,  the  uncertainty  set  is  a  singleton),  and  then  we 
use  those  results  to  find  the  sought-after  capacity  region  of 
the  compound  channel.  Finally,  in  Section  IV  we  consider 
an  alternative  representation  of  the  capacity  region  which 
results  in  a  particularly  compact  characterization  of  the 
fundamental  limits  of  the  multiple-access  channel  in  the 
region  of  high  signal-to-noise  ratios. 

II.  Channel  Model 

The goal of this section is to obtain a channel with
discrete-time outputs whose capacity is the same as that of
the channel with continuous-time output

$$y(t) = \sum_{i=1}^{n} b_1(i)\, s_1(t - iT - \tau_1) + \sum_{i=1}^{n} b_2(i)\, s_2(t - iT - \tau_2) + n(t) \tag{2.1}$$

where $n(t)$ is white Gaussian noise with power spectral
density equal to $\sigma^2$. This goal is achieved by considering
the projection of the observation process $\{y(t)\}$ along the
direction of the unit energy signals $\{s_1(t)\}$ and $\{s_2(t)\}$
and their $T$-shifts:

$$y_k(i) = \int_{iT + \tau_k}^{(i+1)T + \tau_k} y(t)\, s_k(t - iT - \tau_k)\, dt, \qquad k = 1, 2. \tag{2.2}$$

Fig. 2. Symbol periods and cross correlations.


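It is possible to obtain an expression for $\{y_1(i)\}_{i=1}^{n}$ and
$\{y_2(i)\}_{i=1}^{n}$ in terms of the transmitted codewords
$\{b_1(i)\}_{i=1}^{n}$ and $\{b_2(i)\}_{i=1}^{n}$ by substituting (2.1) into (2.2)
and by defining the cross correlations between the assigned
signature waveforms $\{s_1(t)\}$ and $\{s_2(t)\}$ as (assuming
without loss of generality that $\tau_1 \le \tau_2$ (Fig. 2))

$$\rho_{12} = \int_0^T s_1(t)\, s_2(t + \tau_1 - \tau_2)\, dt \tag{2.3a}$$

$$\rho_{21} = \int_0^T s_1(t)\, s_2(t + T + \tau_1 - \tau_2)\, dt. \tag{2.3b}$$

It follows easily that

$$\begin{bmatrix} y_1(i) \\ y_2(i) \end{bmatrix} = \begin{bmatrix} 0 & \rho_{21} \\ 0 & 0 \end{bmatrix} \begin{bmatrix} b_1(i-1) \\ b_2(i-1) \end{bmatrix} + \begin{bmatrix} 1 & \rho_{12} \\ \rho_{12} & 1 \end{bmatrix} \begin{bmatrix} b_1(i) \\ b_2(i) \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ \rho_{21} & 0 \end{bmatrix} \begin{bmatrix} b_1(i+1) \\ b_2(i+1) \end{bmatrix} + \begin{bmatrix} n_1(i) \\ n_2(i) \end{bmatrix} \tag{2.4}$$

for $1 \le i \le n$ (with $b_k(0) = b_k(n+1) = 0$, $k = 1, 2$); the
discrete-time random process $\{[n_1(i)\ n_2(i)]^T\}$ is Gaussian
with zero mean and covariance matrix:

$$E\left[ \begin{bmatrix} n_1(i) \\ n_2(i) \end{bmatrix} \begin{bmatrix} n_1(j) & n_2(j) \end{bmatrix} \right] = \sigma^2 H(i - j)$$

where $H(i) = 0$ if $|i| > 1$, and $H(1)$, $H(0)$, and $H(-1)$ are
the matrices appearing in (2.4), i.e.,

$$H(0) = \begin{bmatrix} 1 & \rho_{12} \\ \rho_{12} & 1 \end{bmatrix}, \qquad H(1) = H^T(-1) = \begin{bmatrix} 0 & \rho_{21} \\ 0 & 0 \end{bmatrix}.$$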
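
The step from (2.1) and (2.2) to (2.4) can be illustrated by a discretized, noise-free simulation. This is a rough sketch under stated assumptions: the waveforms, the sample count, the offsets, and all function names are arbitrary illustrative choices (not from the paper), the symbol period is set to $T = 1$, and the integrals are replaced by Riemann sums.

```python
import math
import random


def cross_correlations(s1, s2, d1, d2, dt):
    """Riemann-sum versions of (2.3a) and (2.3b); d1 <= d2 are the offsets
    tau_1 and tau_2 in samples, and both waveforms have the same length."""
    n_samp = len(s1)

    def s2_at(idx):  # s2 vanishes outside one symbol period
        return s2[idx] if 0 <= idx < n_samp else 0.0

    rho12 = sum(s1[u] * s2_at(u + d1 - d2) for u in range(n_samp)) * dt
    rho21 = sum(s1[u] * s2_at(u + n_samp + d1 - d2) for u in range(n_samp)) * dt
    return rho12, rho21


def matched_filter_outputs(b1, b2, s1, s2, d1, d2, dt):
    """Noise-free y_k(i) of (2.2) for codewords b1, b2 (symbols i = 1..n)."""
    n, n_samp = len(b1), len(s1)
    y = [0.0] * ((n + 2) * n_samp)  # sampled received signal (2.1), no noise
    for b, s, d in ((b1, s1, d1), (b2, s2, d2)):
        for i in range(1, n + 1):
            for u in range(n_samp):
                y[i * n_samp + d + u] += b[i - 1] * s[u]
    outputs = []
    for s, d in ((s1, d1), (s2, d2)):
        outputs.append([
            sum(y[i * n_samp + d + u] * s[u] for u in range(n_samp)) * dt
            for i in range(1, n + 1)
        ])
    return outputs[0], outputs[1]


def unit_energy(samples, dt):
    """Scale a sampled waveform so that its discretized energy equals one."""
    energy = sum(v * v for v in samples) * dt
    return [v / math.sqrt(energy) for v in samples]


def max_model_error(n=5, n_samp=64, d1=10, d2=37, seed=0):
    """Largest gap between the matched-filter outputs and the model (2.4)."""
    rng = random.Random(seed)
    dt = 1.0 / n_samp
    s1 = unit_energy([math.sin(math.pi * (u + 0.5) / n_samp) for u in range(n_samp)], dt)
    s2 = unit_energy([1.0 if u < n_samp // 3 else -1.0 for u in range(n_samp)], dt)
    b1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    rho12, rho21 = cross_correlations(s1, s2, d1, d2, dt)
    y1, y2 = matched_filter_outputs(b1, b2, s1, s2, d1, d2, dt)
    p1, p2 = [0.0] + b1 + [0.0], [0.0] + b2 + [0.0]  # b_k(0) = b_k(n+1) = 0
    err = 0.0
    for i in range(1, n + 1):
        y1_model = p1[i] + rho12 * p2[i] + rho21 * p2[i - 1]
        y2_model = p2[i] + rho12 * p1[i] + rho21 * p1[i + 1]
        err = max(err, abs(y1[i - 1] - y1_model), abs(y2[i - 1] - y2_model))
    return err
```

In this discretization the gap returned by `max_model_error` is at the level of floating-point rounding, which reflects that each symbol of one user overlaps exactly two consecutive symbols of the other user.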


Since the receiver knows the assigned waveforms $\{s_1(t)\}$
and $\{s_2(t)\}$ as well as the symbol epochs $\{iT + \tau_1\}$ and
$\{iT + \tau_2\}$, it can compute $\{y_1(i)\}_{i=1}^{n}$ and $\{y_2(i)\}_{i=1}^{n}$ by
passing the observations through two matched filters for
signals $\{s_1(t)\}$ and $\{s_2(t)\}$, respectively. The key observation
is that this operation does not destroy any information
that is valuable in deciding which messages were
transmitted. This is because the likelihood function (i.e.,
the conditional expectation of the Radon-Nikodym
derivative between the measure induced by the observations
and Wiener measure given that $\{b_1(i)\}_{i=1}^{n}$ and
$\{b_2(i)\}_{i=1}^{n}$ are the transmitted codewords) is equal to a
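constant times (e.g., [12])

$$\exp\left\{ \frac{1}{\sigma^2} \int y(t) \sum_{k=1}^{2} \sum_{i=1}^{n} b_k(i)\, s_k(t - iT - \tau_k)\, dt - \frac{1}{2\sigma^2} \int \Big[ \sum_{k=1}^{2} \sum_{i=1}^{n} b_k(i)\, s_k(t - iT - \tau_k) \Big]^2 dt \right\}$$

which can be factored into

$$f(\{y(t)\})\; g(\{b_1(i)\}, \{b_2(i)\}, \{y_1(i)\}, \{y_2(i)\});$$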

hence because of the factorization theorem [13], $\{y_1(i)\}_{i=1}^{n}$
and $\{y_2(i)\}_{i=1}^{n}$ are sufficient statistics for the transmitted
messages. This implies that the channel output $\{y(t)\}$
enters in the computation of the posterior probability of
each message only through $\{y_1(i)\}_{i=1}^{n}$ and $\{y_2(i)\}_{i=1}^{n}$.
Thus, no matter which codebooks are chosen by the transmitters,
the probability that the maximum a posteriori
decoder selects the true transmitted message remains the
same if instead of working with the original continuous-time
observations $\{y(t)\}$ the decoder is constrained to
work with the discrete-time sequences $\{y_1(i)\}_{i=1}^{n}$ and
$\{y_2(i)\}_{i=1}^{n}$. Therefore, if a rate pair is $\epsilon$-achievable for the
multiple-access channel (2.1), it is also $\epsilon$-achievable for the
multiple-access channel (2.4), and hence the capacity regions
of both channels coincide. In this respect, notice for
future use that if $t(W)$ is a sufficient statistic for $Z$, then
the data-processing inequality is satisfied with equality
because $Z$ and $W$ are conditionally independent given
$t(W)$. Therefore,

$$\begin{aligned} I(Z; t(W)) &= I(Z; W, t(W)) - I(Z; W \mid t(W)) \\ &= I(Z; W) + I(Z; t(W) \mid W) \\ &= I(Z; W). \end{aligned} \tag{2.5}$$
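
Identity (2.5) can be seen at work in a toy discrete example. The sketch below is an assumption-laden illustration unrelated to the Gaussian channel itself: $Z$ is a fair bit, $W = (W_1, W_2)$ are two independent noisy copies of $Z$ with a hypothetical flip probability, and $t(W) = W_1 + W_2$ is a sufficient statistic, while $W_1$ alone is not.

```python
import math
from itertools import product


def mutual_information(joint):
    """I(A;B) in bits from a dict {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)


def joint_z_w(flip=0.2):
    """Joint pmf of a fair bit Z and two independent noisy looks W = (W1, W2)."""
    joint = {}
    for z, w1, w2 in product((0, 1), repeat=3):
        p = 0.5
        for w in (w1, w2):
            p *= flip if w != z else 1 - flip
        joint[(z, (w1, w2))] = p
    return joint


def with_statistic(joint_zw, stat):
    """Joint pmf of (Z, stat(W)) obtained by merging observations."""
    out = {}
    for (z, w), p in joint_zw.items():
        key = (z, stat(w))
        out[key] = out.get(key, 0.0) + p
    return out
```

Comparing `mutual_information(joint_z_w())` with the value obtained after `with_statistic(..., sum)` shows no loss, whereas keeping only the first look (`lambda w: w[0]`) strictly reduces the mutual information.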

Note that even though channel (2.4) has two output
sequences, it is a multiple-access channel rather than an
interference channel because both outputs are available to
the multiuser receiver. Channel (2.4) is parametrized by the
cross correlations $\rho_{12}$ and $\rho_{21}$, which depend on the relative
offset $\tau_2 - \tau_1$ and, therefore, in general, are unknown
to the transmitters. Consequently, it is necessary to analyze
a compound multiple-access channel where the transmitters
only know that $(\rho_{12}, \rho_{21})$ belongs to an uncertainty set
determined by $\{s_1(t)\}$ and $\{s_2(t)\}$.

The main characteristic of the discrete-time multiple-access
channel in (2.4) is that it has memory because the
noise sequence is correlated and each output value depends
on three input symbols, while each of these symbols
affects two consecutive output vectors (cf. Fig. 2). It is
possible to obtain an equivalent multiple-access channel
(Appendix I) whose noise process is independent at the
expense of an enlarged set of observables. The advantage
of the latter discrete-time model is that it is possible to
invoke coding theorems for channels where the outputs are
conditionally independent given the inputs [7].

If either $\rho_{12} = 0$ or $\rho_{21} = 0$, then the channel becomes
memoryless because in that case the users are in effect
symbol synchronous. For example, if the users are assigned
the same signal and the channel is symbol synchronous,
both outputs in (2.4) coincide and are equal to

$$y(i) = b_1(i) + b_2(i) + n(i) \tag{2.6}$$

where $\{n(i)\}$ is a Gaussian independent sequence. Then
the channel is the conventional scalar discrete-time Gaussian
channel, whose capacity region, subject to the energy
constraints in (1.2), is the Cover-Wyner region:

$$C = \left\{ (R_1, R_2):\ 0 \le R_1 \le \frac{1}{2} \log\left(1 + \frac{w_1}{\sigma^2}\right),\ 0 \le R_2 \le \frac{1}{2} \log\left(1 + \frac{w_2}{\sigma^2}\right),\ R_1 + R_2 \le \frac{1}{2} \log\left(1 + \frac{w_1}{\sigma^2} + \frac{w_2}{\sigma^2}\right) \right\}. \tag{2.7}$$

If the assigned signals are not equal but the users remain
symbol synchronous, then (2.4) reduces to the memoryless
multiple-access channel

$$\begin{bmatrix} y_1(i) \\ y_2(i) \end{bmatrix} = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \begin{bmatrix} b_1(i) \\ b_2(i) \end{bmatrix} + \begin{bmatrix} n_1(i) \\ n_2(i) \end{bmatrix} \tag{2.8}$$

where $\{[n_1(i)\ n_2(i)]^T\}$ is an independent Gaussian process
with $E[n_k^2(i)] = \sigma^2$ and $E[n_1(i) n_2(i)] = \sigma^2 \rho$, and

$$\rho = \int_0^T s_1(t)\, s_2(t)\, dt.$$

In this case, the Cover-Wyner region can be easily generalized
(Section III) thanks to the lack of memory when the
users are symbol synchronous.


III.  Capacity  Region 

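Before we obtain the capacity region of the symbol-asynchronous
Gaussian multiple-access channel, we will
generalize the Cover-Wyner region (2.7) to the symbol-synchronous
channel where both users are not necessarily
assigned the same waveform. To this end, according to
(2.8) we need to find the convex closure over independent
random variables $X_1$ and $X_2$ such that $E[X_1^2] \le w_1$ and
$E[X_2^2] \le w_2$ of the union of the pentagons $\{0 \le R_1 \le
I(X_1; Y \mid X_2),\ 0 \le R_2 \le I(X_2; Y \mid X_1),\ R_1 + R_2 \le
I(X_1, X_2; Y)\}$, with the output $Y$ given by

$$Y = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} + \begin{bmatrix} N_1 \\ N_2 \end{bmatrix} \tag{3.1}$$

where $N_1$ and $N_2$ are jointly Gaussian with zero mean,
$E[N_i^2] = \sigma^2$, and $E[N_1 N_2] = \rho \sigma^2$. The case $|\rho| = 1$ results
in the region (2.7); we will therefore assume $|\rho| < 1$. Since
$X_1$ and $X_2$ are independent random variables, the covariance
of $Y$ is equal to²

$$\operatorname{cov}(Y) = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \left[ \sigma^2 I_2 + \begin{bmatrix} \operatorname{var}(X_1) & 0 \\ 0 & \operatorname{var}(X_2) \end{bmatrix} \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \right], \tag{3.2}$$

and we can upper-bound the mutual informations

$$\begin{aligned} I(X_1, X_2; Y) &\le \frac{1}{2} \log\det[\operatorname{cov}(Y)] - \frac{1}{2} \log\det\left[ \sigma^2 \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \right] \\ &= \frac{1}{2} \log\det\left[ I_2 + \frac{1}{\sigma^2} \begin{bmatrix} \operatorname{var}(X_1) & 0 \\ 0 & \operatorname{var}(X_2) \end{bmatrix} \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \right] \\ &= \frac{1}{2} \log\left( 1 + \frac{\operatorname{var}(X_1)}{\sigma^2} + \frac{\operatorname{var}(X_2)}{\sigma^2} + \frac{\operatorname{var}(X_1)\operatorname{var}(X_2)}{\sigma^4}(1 - \rho^2) \right) \end{aligned} \tag{3.3}$$

$$\begin{aligned} I(X_1; Y \mid X_2) &\le \frac{1}{2} \log\det\left[ I_2 + \frac{1}{\sigma^2} \begin{bmatrix} \operatorname{var}(X_1) & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \right] \\ &= \frac{1}{2} \log\left( 1 + \frac{\operatorname{var}(X_1)}{\sigma^2} \right) \end{aligned} \tag{3.4}$$

and similarly,

$$I(X_2; Y \mid X_1) \le \frac{1}{2} \log\left( 1 + \frac{\operatorname{var}(X_2)}{\sigma^2} \right) \tag{3.5}$$

with equality in (3.3), (3.4), and (3.5) if $X_1$ and $X_2$ are
Gaussian. Furthermore, all three rate constraints are simultaneously
maximized by letting $X_1$ and $X_2$ attain the
maximum allowable variances, i.e., $w_1$ and $w_2$, respectively.
Hence the capacity region is equal to the pentagon

$$C = \left\{ (R_1, R_2):\ 0 \le R_1 \le \frac{1}{2} \log\left(1 + \frac{w_1}{\sigma^2}\right),\ 0 \le R_2 \le \frac{1}{2} \log\left(1 + \frac{w_2}{\sigma^2}\right),\ R_1 + R_2 \le \frac{1}{2} \log\left(1 + \frac{w_1}{\sigma^2} + \frac{w_2}{\sigma^2} + \frac{w_1 w_2}{\sigma^4}(1 - \rho^2)\right) \right\} \tag{3.6}$$

which differs from (2.7) in that the maximum rate sum is
no longer the capacity of a single-user channel whose
signal-to-noise ratio is equal to $(w_1 + w_2)/\sigma^2$. Notice that
when $\{s_1(t)\}$ and $\{s_2(t)\}$ are orthogonal ($\rho = 0$), then,
effectively, both users transmit in separate noninterfering
channels and can send information at single-user rates.
The $K$-user capacity region of the symbol-synchronous
Gaussian channel can be found in [14].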
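
The three bounds of the pentagon (3.6) are easy to evaluate. The following is a minimal sketch with a hypothetical function name; it assumes base-2 logarithms (rates in bits), which the text leaves unspecified.

```python
import math


def pentagon_bounds(w1, w2, sigma2, rho):
    """Return (max R1, max R2, max R1 + R2) of region (3.6), in bits."""
    snr1, snr2 = w1 / sigma2, w2 / sigma2
    r1 = 0.5 * math.log2(1 + snr1)
    r2 = 0.5 * math.log2(1 + snr2)
    r_sum = 0.5 * math.log2(1 + snr1 + snr2 + snr1 * snr2 * (1 - rho ** 2))
    return r1, r2, r_sum
```

With `rho = 0` the sum bound equals the sum of the two single-user bounds, so the pentagon degenerates into a rectangle; with `abs(rho) = 1` the sum bound reduces to that of the Cover-Wyner region (2.7).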

Before we state and prove the formula for the capacity
of the symbol-asynchronous Gaussian multiple-access
channel, we will motivate the expression of the capacity
region by finding the mutual information rates in channel
(2.4) when the inputs are stationary Gaussian processes
with power spectral densities $\{S_1(\omega),\ \omega \in [-\pi, \pi]\}$ and
$\{S_2(\omega),\ \omega \in [-\pi, \pi]\}$. Channel (2.4) is a two-input two-output
dynamic linear time-invariant system whose outputs
are embedded in colored stationary Gaussian noise. If
the inputs are stationary Gaussian processes, then the
mutual information rates can be written as the difference
between the differential entropy rates of the output with
and without each of the input processes. Consequently, all
that is needed is an expression for the differential entropy rate
of a stationary vector Gaussian discrete-time process. In
the scalar case the differential entropy rate of a Gaussian
discrete-time process whose power spectral density is $S(\omega)$
is equal to [15, p. 542]

$$h(S) = \frac{1}{2} \log\left( 2\pi e\, L(S) \right) \tag{3.7}$$

where $L(S)$ is the geometric mean of $S(\omega)$, i.e.,

$$L(S) = \exp\left\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} \log S(\omega)\, d\omega \right\}. \tag{3.8}$$

This follows because the differential entropy of a Gaussian
$n$-vector with covariance matrix $\Sigma_n$ is
$(n/2) \log(2\pi e (\det \Sigma_n)^{1/n})$ and according to the Toeplitz
distribution theorem [16], $\lim_{n\to\infty} (\det \Sigma_n)^{1/n}$ coincides
with the geometric mean of the Fourier transform of the
covariance sequence. What we need for our purposes is a
generalization of this result to vector random processes,
i.e., we need to find $\lim_{n\to\infty} (\det \Sigma_n)^{1/n}$ when $\Sigma_n$ is an
$n$-block Toeplitz matrix whose elements are $2 \times 2$ covariance
matrices $R(i - j) = \Sigma_n(i, j)$. A solution to this problem
can be found in [17] where it is shown that if the
power spectral density matrix $M(\omega) = \sum_{n=-\infty}^{\infty} e^{-j\omega n} R(n)$ is
continuous and positive definite in $(-\pi, \pi]$,³ then the
foregoing limit is equal to the geometric mean of the
determinant of $M(\omega)$.

²$I_n$ denotes the $n \times n$ identity matrix.
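Now the output of channel (2.4) is a zero-mean vector
Gaussian process with power spectral density matrix given
by

$$K(\omega) = T(\omega) \begin{bmatrix} S_1(\omega) & 0 \\ 0 & S_2(\omega) \end{bmatrix} T(\omega) + \sigma^2 T(\omega), \qquad T(\omega) = \begin{bmatrix} 1 & \rho_{12} + \rho_{21} e^{-j\omega} \\ \rho_{12} + \rho_{21} e^{j\omega} & 1 \end{bmatrix}.$$

Therefore, the mutual information rate between the output
and the inputs is equal to

$$\begin{aligned} \lim_{n\to\infty} \frac{1}{n} I(X_1^n, X_2^n; Y^n) &= h(K) - h(\sigma^2 T) \\ &= \frac{1}{4\pi} \int_{-\pi}^{\pi} \log\det\left[ I_2 + \frac{1}{\sigma^2} \begin{bmatrix} S_1(\omega) & 0 \\ 0 & S_2(\omega) \end{bmatrix} T(\omega) \right] d\omega \\ &= \frac{1}{4\pi} \int_{-\pi}^{\pi} \log\left[ 1 + \frac{S_1(\omega)}{\sigma^2} + \frac{S_2(\omega)}{\sigma^2} + \frac{S_1(\omega) S_2(\omega)}{\sigma^4} \left(1 - \rho^2(\omega)\right) \right] d\omega \end{aligned} \tag{3.9}$$

where

$$\rho^2(\omega) = |\rho_{12} + \rho_{21} e^{j\omega}|^2 = \rho_{12}^2 + \rho_{21}^2 + 2 \rho_{21} \rho_{12} \cos\omega. \tag{3.10}$$

Similarly, setting $S_2(\omega) = 0$ and $S_1(\omega) = 0$, respectively, in
(3.9) we get

$$\lim_{n\to\infty} \frac{1}{n} I(X_1^n; Y^n \mid X_2^n) = \frac{1}{4\pi} \int_{-\pi}^{\pi} \log\left( 1 + \frac{S_1(\omega)}{\sigma^2} \right) d\omega \tag{3.11}$$

$$\lim_{n\to\infty} \frac{1}{n} I(X_2^n; Y^n \mid X_1^n) = \frac{1}{4\pi} \int_{-\pi}^{\pi} \log\left( 1 + \frac{S_2(\omega)}{\sigma^2} \right) d\omega. \tag{3.12}$$

As mentioned in Section I, we will find the capacity
region of the asynchronous channel first in the case where
the transmitters know the offset, and hence the cross
correlations between their signals, and then in the case
where they do not.

Theorem 1: The capacity region of the energy-constrained
asynchronous Gaussian multiple-access channel
when the transmitters know their mutual offset is given by

$$C = \bigcup_{\substack{S_k(\omega) \ge 0,\ \omega \in [-\pi, \pi] \\ \frac{1}{2\pi} \int_{-\pi}^{\pi} S_k(\omega)\, d\omega = w_k \\ k = 1, 2}} \Bigg\{ (R_1, R_2):\ 0 \le R_1 \le \frac{1}{4\pi} \int_{-\pi}^{\pi} \log\left(1 + \frac{S_1(\omega)}{\sigma^2}\right) d\omega,\ \ 0 \le R_2 \le \frac{1}{4\pi} \int_{-\pi}^{\pi} \log\left(1 + \frac{S_2(\omega)}{\sigma^2}\right) d\omega,$$

$$R_1 + R_2 \le \frac{1}{4\pi} \int_{-\pi}^{\pi} \log\left( 1 + \frac{S_1(\omega)}{\sigma^2} + \frac{S_2(\omega)}{\sigma^2} + \frac{S_1(\omega) S_2(\omega)}{\sigma^4} \left[ 1 - \rho_{12}^2 - \rho_{21}^2 - 2 \rho_{12} \rho_{21} \cos\omega \right] \right) d\omega \Bigg\}. \tag{3.13}$$

³In the present case the power spectral density of the output vector
process is indeed continuous, but in problems with heavily correlated
waveforms it may fail to be nonsingular at particular frequencies. However,
the capacity region is derived later in this section without imposing
any of those assumptions.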
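
The sum-rate bound in (3.13) can be evaluated numerically for one member of the union. The sketch below assumes flat spectra $S_k(\omega) = w_k$, which satisfy the power constraint but are not claimed to be optimal; the function name, the base-2 logarithm, and the midpoint integration rule are choices made here for illustration.

```python
import math


def sum_rate_flat_spectra(w1, w2, sigma2, rho12, rho21, steps=4096):
    """Sum-rate bound of (3.13), in bits, for S_1(w) = w1 and S_2(w) = w2."""
    snr1, snr2 = w1 / sigma2, w2 / sigma2
    total = 0.0
    for j in range(steps):
        w = -math.pi + (j + 0.5) * 2 * math.pi / steps
        # rho^2(w) as in (3.10)
        rho_sq = rho12 ** 2 + rho21 ** 2 + 2 * rho12 * rho21 * math.cos(w)
        total += math.log2(1 + snr1 + snr2 + snr1 * snr2 * (1 - rho_sq))
    # (1/4pi) times the integral over (-pi, pi] equals half the grid average.
    return 0.5 * total / steps
```

When `rho21 = 0` the integrand no longer depends on the frequency and the value coincides with the sum-rate bound of the symbol-synchronous pentagon with correlation `rho12`.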


Proof: It is shown in [7, Theorem 3] that the capacity
region of the frame-synchronous discrete-time multiple-access
channel with finite memory where the outputs depend
on several consecutive input symbols and are conditionally
independent given the inputs is equal to

$$C = \operatorname{closure}\left( \liminf_{n\to\infty} \frac{1}{n} C_n \right) \tag{3.14}$$

where $C_n$ is the following achievable region for the $n$-block
memoryless multiple-access channel whose input symbols
correspond to $n$ consecutive channel uses of the original
channel with memory:

$$C_n = \bigcup_{X_1^n, X_2^n} \left\{ (R_1, R_2):\ 0 \le R_1 \le I(X_1^n; Y^n \mid X_2^n),\ 0 \le R_2 \le I(X_2^n; Y^n \mid X_1^n),\ R_1 + R_2 \le I(X_1^n, X_2^n; Y^n) \right\} \tag{3.15}$$

where the union is over the independent random variables
$X_1^n$ and $X_2^n$ satisfying in the present case $E[X_k^{nT} X_k^n] \le n w_k$,
$k = 1, 2$.

The aforementioned class of multiple-access channels with finite memory includes as a special case the discrete-time channel in (1.3) whose noise sequence is independent. It does not encompass channel (2.4) directly because the noise sequence therein is dependent. However, since the observables of (2.4) are sufficient statistics for the inputs and are deterministic transformations of the redundant observables in (1.3), not only their capacities coincide but, according to (2.5), the mutual informations arising in the achievable regions, $C_n$, of the respective induced n-block memoryless channels are also equal. Therefore, it is enough to show that the closed set in (3.13) is equal to $\lim_{n\to\infty}(1/n)C_n$, where $C_n$ is the achievable region in (3.15) for the n-block memoryless multiple-access channel induced by (2.4). To this end it is easy to check using (2.4) that the n-block multiple-access channel can be written as


Fig. 3. n-block memoryless two-user channel.

As in (3.3), (3.4), and (3.5) we can upper-bound the mutual informations by

$$I(X_1^n,X_2^n;Y^n)\le\tfrac12\log\det[\operatorname{cov}(Y^n)]-\tfrac12\log\det[\sigma^2R]=\tfrac12\log\det\bigl[I_{2n}+\sigma^{-2}E[X^nX^{nT}]\,R\bigr] \tag{3.19}$$


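$$Y^n=\begin{bmatrix}Y_1(1)\\Y_2(1)\\Y_1(2)\\\vdots\\Y_1(n)\\Y_2(n)\end{bmatrix}=\begin{bmatrix}1&\rho_{12}&&&&\\\rho_{12}&1&\rho_{21}&&&\\&\rho_{21}&1&\rho_{12}&&\\&&\ddots&\ddots&\ddots&\\&&&\rho_{21}&1&\rho_{12}\\&&&&\rho_{12}&1\end{bmatrix}\begin{bmatrix}X_1(1)\\X_2(1)\\X_1(2)\\\vdots\\X_1(n)\\X_2(n)\end{bmatrix}+\begin{bmatrix}N_1(1)\\N_2(1)\\N_1(2)\\\vdots\\N_1(n)\\N_2(n)\end{bmatrix} \tag{3.16}$$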


which is depicted in Fig. 3 and where, according to (2.4), the noise vector is Gaussian with zero mean and covariance matrix $\sigma^2R$, where R is the block diagonal 2n × 2n cross correlation matrix multiplying the input vector in (3.16). This is a positive-definite matrix because $x^TRx$ is equal to the energy of $\sum_{i=1}^{n}\sum_{k=1}^{2}x_k(i)\,s_k(t-\tau_k-iT)$, which is guaranteed to be nonzero if $x\ne0$, $\rho_{12}\ne0$, and $\rho_{21}\ne0$. Throughout this proof we assume that $\rho_{12}\ne0$ and $\rho_{21}\ne0$; otherwise, the channel is equivalent to a symbol-synchronous channel and the capacity region is given by (3.6) (which coincides with (3.13) because if $\rho_{12}\rho_{21}=0$, then the three rate constraints therein are maximized by white spectra).

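The output covariance matrix is equal to

$$\operatorname{cov}(Y^n)=R\bigl[\sigma^2I_{2n}+E[X^nX^{nT}]\,R\bigr] \tag{3.17}$$

where⁴

$$X^n=[X_1(1),X_2(1),X_1(2),\ldots,X_1(n),X_2(n)]^T=X_1^n\otimes\begin{bmatrix}1\\0\end{bmatrix}+X_2^n\otimes\begin{bmatrix}0\\1\end{bmatrix}. \tag{3.18}$$

⁴ $A\otimes B$ denotes the Kronecker product of the matrices A and B.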


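and

$$I(X_1^n;Y^n\mid X_2^n)\le\tfrac12\log\det[I_n+\sigma^{-2}\Sigma_1] \tag{3.20a}$$

$$I(X_2^n;Y^n\mid X_1^n)\le\tfrac12\log\det[I_n+\sigma^{-2}\Sigma_2] \tag{3.20b}$$

where $\Sigma_k=\operatorname{cov}(X_k^n)$, k = 1, 2, and equality holds in (3.19) and (3.20) if $X_1^n$ and $X_2^n$ are Gaussian. The following identity whose proof is in Appendix II gives an explicit expression of (3.19) as a function of $\Sigma_1$, $\Sigma_2$, $\rho_{12}$, and $\rho_{21}$.

Lemma 1: The following identity holds

$$\det\bigl[I_{2n}+\sigma^{-2}E[X^nX^{nT}]\,R\bigr]$$

Therefore, (3.15) reduces to the following union over all trace-constrained nonnegative-definite n × n matrices:

$$C_n=\bigcup_{\substack{\Sigma_k\ge0\\ \frac1n\operatorname{tr}\Sigma_k\le w_k,\ k=1,2}}\Bigl\{(R_1,R_2):\ 0\le R_1\le\tfrac12\log\det[I_n+\sigma^{-2}\Sigma_1],\quad 0\le R_2\le\tfrac12\log\det[I_n+\sigma^{-2}\Sigma_2],$$
$$R_1+R_2\le\tfrac12\log\det\Bigl[I_{2n}+\frac{1}{\sigma^2}\begin{bmatrix}\Sigma_1&0\\0&\Sigma_2\end{bmatrix}\begin{bmatrix}I_n&S^T\\S&I_n\end{bmatrix}\Bigr]\Bigr\}. \tag{3.23}$$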


Region $C_n$ is a convex set because each of the three rate constraints in (3.21) is a concave function of $(\Sigma_1,\Sigma_2)$. This is a consequence of the fact that $\log\det[A]$ is concave in A if A is a positive-definite matrix [18, p. 125]. Note that even though the determinant appearing in the rate-sum constraint is not that of a symmetric matrix, it is equal to the determinant of the positive-definite matrix


where  QTQ  = 


There is no covariance pair $(\Sigma_1,\Sigma_2)$ that maximizes all three rate constraints in (3.23) simultaneously. This is in contrast to the symbol-synchronous channel where we saw that the mutual informations in (3.3)-(3.5) are maximized simultaneously by a pair of input distributions, thus resulting in a capacity region which is equal to a pentagon. Nevertheless, we will be able to show that there is a set of optimum eigenvectors for each user in the sense that it is enough to take the union in (3.23) only over the subset of covariance matrices having those eigenvectors, thereby effectively reducing the union to one over diagonal matrices. To prove this, the first step is to apply the singular-value decomposition theorem to the matrix S defined in (3.22). According to this result [19, p. 192], we can write

$$S=UDV^T \tag{3.24}$$

where U and V are orthogonal matrices (of the eigenvectors of $SS^T$ and $S^TS$, respectively) and D is a diagonal matrix of the singular values $\{d_i\}_{i=1}^n$ of S, i.e., the nonnegative square roots of the eigenvalues of the nonnegative-definite Jacobi matrix $S^TS$ in (3.25).


Fig. 4. Decoupled n-block memoryless two-user channel.

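Now, using the orthogonality of U and V, we can express the determinant in the rate-sum constraint in (3.23) as

$$\det\Bigl[I_{2n}+\frac{1}{\sigma^2}\begin{bmatrix}\Sigma_1&0\\0&\Sigma_2\end{bmatrix}\begin{bmatrix}I_n&S^T\\S&I_n\end{bmatrix}\Bigr]=\det\Bigl[I_{2n}+\frac{1}{\sigma^2}\begin{bmatrix}\Sigma_1&0\\0&\Sigma_2\end{bmatrix}\begin{bmatrix}V&0\\0&U\end{bmatrix}\begin{bmatrix}I_n&D\\D&I_n\end{bmatrix}\begin{bmatrix}V^T&0\\0&U^T\end{bmatrix}\Bigr]$$
$$=\det\Bigl[I_{2n}+\frac{1}{\sigma^2}\begin{bmatrix}V^T\Sigma_1V&0\\0&U^T\Sigma_2U\end{bmatrix}\begin{bmatrix}I_n&D\\D&I_n\end{bmatrix}\Bigr]=\det\Bigl[I_{2n}+\frac{1}{\sigma^2}\begin{bmatrix}\Lambda_1&0\\0&\Lambda_2\end{bmatrix}\begin{bmatrix}I_n&D\\D&I_n\end{bmatrix}\Bigr] \tag{3.26}$$

where we have set $\Lambda_1=V^T\Sigma_1V$ and $\Lambda_2=U^T\Sigma_2U$. Since $\operatorname{tr}(\Lambda_k)=\operatorname{tr}(\Sigma_k)$, $\Sigma_k\ge0$ if and only if $\Lambda_k\ge0$, and

$$\det[I_n+\sigma^{-2}\Lambda_k]=\det[I_n+\sigma^{-2}\Sigma_k], \tag{3.27}$$

the region in (3.23) is equal to

$$C_n=\bigcup_{\substack{\Lambda_k\ge0\\ \operatorname{tr}\Lambda_k\le n\,w_k,\ k=1,2}}\Bigl\{(R_1,R_2):\ 0\le R_1\le\tfrac12\log\det[I_n+\sigma^{-2}\Lambda_1],\quad 0\le R_2\le\tfrac12\log\det[I_n+\sigma^{-2}\Lambda_2],$$
$$R_1+R_2\le\tfrac12\log\det\Bigl[I_{2n}+\frac{1}{\sigma^2}\begin{bmatrix}\Lambda_1&0\\0&\Lambda_2\end{bmatrix}\begin{bmatrix}I_n&D\\D&I_n\end{bmatrix}\Bigr]\Bigr\}. \tag{3.28}$$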


$$S^TS=\begin{bmatrix}\rho_{12}^2&\rho_{12}\rho_{21}&&&\\\rho_{12}\rho_{21}&\rho_{12}^2+\rho_{21}^2&\rho_{12}\rho_{21}&&\\&\ddots&\ddots&\ddots&\\&&\rho_{12}\rho_{21}&\rho_{12}^2+\rho_{21}^2&\rho_{12}\rho_{21}\\&&&\rho_{12}\rho_{21}&\rho_{12}^2+\rho_{21}^2\end{bmatrix} \tag{3.25}$$

Thus in effect the singular-value decomposition theorem has allowed us to substitute the matrix S in (3.23) by the diagonal matrix D. This is advantageous because the set in (3.28) is actually the capacity region of the two-user Gaussian memoryless channel shown in Fig. 4. This channel differs from the one in Fig. 3 in that the inputs corresponding to different coordinates do not interfere, and the noise covariance matrix is

$$\sigma^2\begin{bmatrix}I_n&D\\D&I_n\end{bmatrix}. \tag{3.29}$$


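matrices; hence we can now write

$$C_n=\bigcup_{\substack{\lambda_{ki}\ge0\\ \frac1n\sum_{i=1}^n\lambda_{ki}\le w_k,\ k=1,2}}\Bigl\{(R_1,R_2):\ 0\le R_1\le\tfrac12\sum_{i=1}^n\log\Bigl(1+\frac{\lambda_{1i}}{\sigma^2}\Bigr),\quad 0\le R_2\le\tfrac12\sum_{i=1}^n\log\Bigl(1+\frac{\lambda_{2i}}{\sigma^2}\Bigr),$$
$$R_1+R_2\le\tfrac12\sum_{i=1}^n\log\Bigl(1+\frac{\lambda_{1i}}{\sigma^2}+\frac{\lambda_{2i}}{\sigma^2}+\frac{\lambda_{1i}\lambda_{2i}}{\sigma^4}\bigl(1-d_i^2\bigr)\Bigr)\Bigr\}.$$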

Therefore, the singular-value decomposition of S effectively decouples the original channel in (3.16) into independent 2 × 2 multiple-access subchannels. The capacity region of this decoupled channel is achieved by input distributions whose coordinates are independent. To prove this, we will show that the rate constraints in (3.28) are maximized by diagonal matrices $\Lambda_1$ and $\Lambda_2$ and therefore, the matrices of optimum eigenvectors for $\Sigma_1$ and $\Sigma_2$ are V and U, respectively. First we apply the Hadamard inequality (the determinant of a nonnegative-definite matrix is upper-bounded by the product of its diagonal elements) to the individual rate constraints in (3.28):


(3.32) 


It remains to show that the limit as $n\to\infty$ of the set (3.32) is equal to (3.13). The approach we follow is to show that the Pareto-optimal⁵ rate pairs of (3.13) coincide with those of $\lim_{n\to\infty}(1/n)C_n$.

The integrand in the rate-sum constraint of the region C in (3.13) is equal to (cf. (3.9))

$$\log\det\Bigl[I_2+\frac{1}{\sigma^2}\begin{bmatrix}S_1(\omega)&0\\0&S_2(\omega)\end{bmatrix}T(\omega)\Bigr]$$


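$$\tfrac12\log\det[I_n+\sigma^{-2}\Lambda_k]\le\tfrac12\sum_{i=1}^n\log\Bigl(1+\frac{\lambda_{ki}}{\sigma^2}\Bigr),\qquad k=1,2 \tag{3.30}$$

where $\lambda_{ki}$ is the ith diagonal entry of $\Lambda_k$, and equality holds in (3.30) when $\Lambda_k$ is diagonal. Second, to upper-bound the rate-sum constraint in (3.28) in terms of the diagonal elements of $\Lambda_1$ and $\Lambda_2$, we will invoke the following result proved in Appendix III.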

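Lemma 2: Let A and B be n × n nonnegative-definite matrices, and let $\Delta=\operatorname{diag}(\delta_1,\ldots,\delta_n)$, where $\delta_i$ is a complex scalar such that $|\delta_i|\le1$ for i = 1, ..., n. Then

$$\det\Bigl[I_{2n}+\begin{bmatrix}A&0\\0&B\end{bmatrix}\begin{bmatrix}I_n&\Delta^{*T}\\\Delta&I_n\end{bmatrix}\Bigr]\le\prod_{i=1}^n\bigl\{1+a_{ii}+b_{ii}+a_{ii}b_{ii}\bigl(1-|\delta_i|^2\bigr)\bigr\} \tag{3.31}$$

with equality if A and B are diagonal.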
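Lemma 2 can be spot-checked numerically. The sketch below is illustrative only (the helper names `det` and `lemma2_sides` and the test matrices are assumptions, not taken from the paper); it builds the 2n × 2n matrix on the left side of (3.31) in pure Python and compares its determinant with the product on the right side.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting (complex-safe)."""
    a = [[complex(x) for x in row] for row in matrix]
    n = len(a)
    d = 1.0 + 0.0j
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) == 0.0:
            return 0.0j
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def lemma2_sides(A, B, delta):
    """Return (lhs, rhs) of (3.31) for real symmetric n x n A, B and a list delta."""
    n = len(A)
    M = [[0j] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            M[i][j] = A[i][j]                                     # block A * I
            M[i][n + j] = A[i][j] * complex(delta[j]).conjugate() # block A * conj(Delta)
            M[n + i][j] = B[i][j] * delta[j]                      # block B * Delta
            M[n + i][n + j] = B[i][j]                             # block B * I
    for i in range(2 * n):
        M[i][i] += 1.0
    lhs = det(M).real
    rhs = 1.0
    for i in range(n):
        rhs *= 1.0 + A[i][i] + B[i][i] + A[i][i] * B[i][i] * (1.0 - abs(delta[i]) ** 2)
    return lhs, rhs

delta = [0.3 + 0.4j, 0.6]
lhs_diag, rhs_diag = lemma2_sides([[2.0, 0.0], [0.0, 1.0]], [[0.5, 0.0], [0.0, 3.0]], delta)
lhs_full, rhs_full = lemma2_sides([[2.0, 1.0], [1.0, 1.0]], [[1.0, 0.5], [0.5, 3.0]], delta)
```

For the diagonal pair the two sides agree, while for the non-diagonal pair the determinant is strictly smaller, which is the mechanism that lets the proof restrict attention to diagonal matrices.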

We apply Lemma 2 to the case $A=\sigma^{-2}\Lambda_1$, $B=\sigma^{-2}\Lambda_2$ and $\Delta=D$, where the singular values of S (i.e., the diagonal elements of D) are real numbers belonging to the interval (0, 1) since R is positive-definite. (See [19, p. 382], and (II.7).) It then follows from (3.30) and Lemma 2 that the three constraints in (3.28) are maximized by diagonal


which is a concave function of $(S_1(\omega),S_2(\omega))$ for all $\omega\in[-\pi,\pi]$. Then C can be shown to be a convex set by following the reasoning we used to show the convexity of $C_n$. However, if the closed set C is convex, then each of its Pareto-optimal rate pairs has the property that it attains the maximum

$$\max_{(R_1,R_2)\in C}\ \alpha R_1+(1-\alpha)R_2 \tag{3.33}$$

for some $0\le\alpha\le1$ (see [20]).

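For each spectral pair, denote the rate-sum constraint in (3.13) by

$$F(S_1,S_2)=\frac{1}{4\pi}\int_{-\pi}^{\pi}\log\Bigl(1+\frac{S_1(\omega)}{\sigma^2}+\frac{S_2(\omega)}{\sigma^2}+\frac{S_1(\omega)S_2(\omega)}{\sigma^4}\bigl[1-\rho_{12}^2-\rho_{21}^2-2\rho_{12}\rho_{21}\cos\omega\bigr]\Bigr)d\omega. \tag{3.34}$$

Notice that the individual rate constraints in (3.13) are $F(S_1,0)$ and $F(0,S_2)$, respectively. Furthermore, to simplify the notation, the $L_1[-\pi,\pi]$ subset of power spectra satisfying $(1/2\pi)\int_{-\pi}^{\pi}S(\omega)\,d\omega\le w$ will be denoted by P(w).

⁵ An achievable rate pair $(R_1,R_2)$ is Pareto-optimal if no other pair $(R_1+\delta_1,R_2+\delta_2)$ with $\delta_1\ge0$ and $\delta_2>0$ is achievable. For example, in the pentagons (2.7) and (3.6) only the points on the boundary of the capacity region belonging to the 135° segment are Pareto optimal.

Then for every $0\le\alpha\le1$, (3.33) is equal to

$$\max_{\substack{S_1\in P(w_1)\\S_2\in P(w_2)}}\ \max_{\substack{0\le R_1\le F(S_1,0)\\0\le R_2\le F(0,S_2)\\R_1+R_2\le F(S_1,S_2)}}\alpha R_1+(1-\alpha)R_2$$
$$=\max_{\substack{S_1\in P(w_1)\\S_2\in P(w_2)}}\max\bigl\{\alpha F(S_1,0)+(1-\alpha)[F(S_1,S_2)-F(S_1,0)],\ \alpha[F(S_1,S_2)-F(0,S_2)]+(1-\alpha)F(0,S_2)\bigr\}$$
$$=\begin{cases}\displaystyle\max_{\substack{S_1\in P(w_1)\\S_2\in P(w_2)}}(2\alpha-1)F(S_1,0)+(1-\alpha)F(S_1,S_2),&\text{if }\tfrac12\le\alpha\le1\\[2ex]\displaystyle\max_{\substack{S_1\in P(w_1)\\S_2\in P(w_2)}}(1-2\alpha)F(0,S_2)+\alpha F(S_1,S_2),&\text{if }0\le\alpha\le\tfrac12\end{cases} \tag{3.35}$$

where (3.35) follows from

$$F(S_1,S_2)\le F(S_1,0)+F(0,S_2). \tag{3.36}$$

Following the same approach with the convex set (3.32), we obtain that every Pareto-optimal pair in $(1/n)C_n$ attains

$$\max_{(R_1,R_2)\in\frac1nC_n}\alpha R_1+(1-\alpha)R_2=\begin{cases}\displaystyle\max_{\substack{\lambda_{ki}\ge0\\\frac1n\sum_{i=1}^n\lambda_{ki}\le w_k,\ k=1,2}}\frac{1}{2n}\sum_{i=1}^n\Bigl[(2\alpha-1)\log\Bigl(1+\frac{\lambda_{1i}}{\sigma^2}\Bigr)+(1-\alpha)\log\Bigl(1+\frac{\lambda_{1i}}{\sigma^2}+\frac{\lambda_{2i}}{\sigma^2}+\frac{\lambda_{1i}\lambda_{2i}}{\sigma^4}(1-d_i^2)\Bigr)\Bigr],&\text{if }\tfrac12\le\alpha\le1\\[2ex]\displaystyle\max_{\substack{\lambda_{ki}\ge0\\\frac1n\sum_{i=1}^n\lambda_{ki}\le w_k,\ k=1,2}}\frac{1}{2n}\sum_{i=1}^n\Bigl[(1-2\alpha)\log\Bigl(1+\frac{\lambda_{2i}}{\sigma^2}\Bigr)+\alpha\log\Bigl(1+\frac{\lambda_{1i}}{\sigma^2}+\frac{\lambda_{2i}}{\sigma^2}+\frac{\lambda_{1i}\lambda_{2i}}{\sigma^4}(1-d_i^2)\Bigr)\Bigr],&\text{if }0\le\alpha\le\tfrac12\end{cases} \tag{3.37}$$

for some $0\le\alpha\le1$. To show that, for every $0\le\alpha\le1$, the limit as $n\to\infty$ of the right side of (3.37) coincides with (3.35), we will fix $1/2\le\alpha\le1$ (the proof for $0\le\alpha<1/2$ is identical) and we will prove that

$$\max_{\phi_k\in P(w_k/\sigma^2),\ k=1,2}\ \frac{1}{4\pi}\int_{-\pi}^{\pi}f\bigl(\phi_1(\omega),\phi_2(\omega),\rho^2(\omega)\bigr)\,d\omega=\lim_{n\to\infty}\max_{\substack{\phi_{ki}\ge0\\\frac1n\sum_{i=1}^n\phi_{ki}\le w_k/\sigma^2,\ k=1,2}}\frac{1}{2n}\sum_{i=1}^nf\bigl(\phi_{1i},\phi_{2i},d_i^2\bigr) \tag{3.38}$$

where

$$f(z_1,z_2,\rho^2)=(2\alpha-1)\log(1+z_1)+(1-\alpha)\log\bigl(1+z_1+z_2+z_1z_2(1-\rho^2)\bigr). \tag{3.39}$$

If $\alpha=1$, then (3.38) follows immediately because the maximizing arguments in the left and right sides therein are easily shown to be the constant functions $\phi_k(\omega)=\phi_{ki}=w_k/\sigma^2$, $\omega\in[-\pi,\pi]$, i = 1, ..., n, k = 1, 2. If $1/2\le\alpha<1$, we invoke the following result (proved in Appendix IV) regarding the optimization problem in (3.38).

Lemma 3: If $1/2\le\alpha<1$, then

$$\max_{\phi_k\in P(w_k/\sigma^2),\ k=1,2}\ \frac{1}{4\pi}\int_{-\pi}^{\pi}f\bigl(\phi_1(\omega),\phi_2(\omega),\rho^2(\omega)\bigr)\,d\omega=\frac{1}{4\pi}\int_{-\pi}^{\pi}g\bigl(\rho^2(\omega),\theta_1,\theta_2\bigr)\,d\omega \tag{3.40}$$

where $\theta_1$, $\theta_2$ are positive scalars such that

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}\gamma_k\bigl(\rho^2(\omega),\theta_1,\theta_2\bigr)\,d\omega=\frac{w_k}{\sigma^2},\qquad k=1,2 \tag{3.41}$$

$$g(\rho^2,\theta_1,\theta_2)=f\bigl(\gamma_1(\rho^2,\theta_1,\theta_2),\gamma_2(\rho^2,\theta_1,\theta_2),\rho^2\bigr) \tag{3.42}$$

and $\gamma_k(\cdot,\theta_1,\theta_2)$, k = 1, 2, are continuous functions.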

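Proceeding as in the proof of Lemma 3, it follows that the same result holds for the finite-dimensional optimization problem in the right side of (3.38), i.e.,

$$\max_{\substack{\phi_{ki}\ge0\\\frac1n\sum_{i=1}^n\phi_{ki}\le w_k/\sigma^2,\ k=1,2}}\frac{1}{2n}\sum_{i=1}^nf\bigl(\phi_{1i},\phi_{2i},d_i^2\bigr)=\frac{1}{2n}\sum_{i=1}^ng\bigl(d_i^2,\theta_1,\theta_2\bigr) \tag{3.43}$$

with $\theta_1$, $\theta_2$ such that

$$\frac1n\sum_{i=1}^n\gamma_k\bigl(d_i^2,\theta_1,\theta_2\bigr)=\frac{w_k}{\sigma^2},\qquad k=1,2. \tag{3.44}$$

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 35, NO. 4, JULY 1989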

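Since for any pair of signal-to-noise ratios $(w_1/\sigma^2,w_2/\sigma^2)$ there exist solutions $\theta_1$ and $\theta_2$ to (3.41) and to (3.44), the identity in (3.38) will follow if we can show that for every fixed positive pair $(\theta_1,\theta_2)$

$$\lim_{n\to\infty}\frac{1}{2n}\sum_{i=1}^ng\bigl(d_i^2,\theta_1,\theta_2\bigr)=\frac{1}{4\pi}\int_{-\pi}^{\pi}g\bigl(\rho^2(\omega),\theta_1,\theta_2\bigr)\,d\omega \tag{3.45}$$

and

$$\lim_{n\to\infty}\frac1n\sum_{i=1}^n\gamma_k\bigl(d_i^2,\theta_1,\theta_2\bigr)=\frac{1}{2\pi}\int_{-\pi}^{\pi}\gamma_k\bigl(\rho^2(\omega),\theta_1,\theta_2\bigr)\,d\omega. \tag{3.46}$$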

To prove this, we need to examine the behavior as $n\to\infty$ of $(d_1^2,\ldots,d_n^2)$, the eigenvalues of the Jacobi matrix $S^TS$ in (3.25). It can be shown that

$$d_i^2=\rho_{12}^2+\rho_{21}^2+2\rho_{12}\rho_{21}\beta_i,\qquad i=1,\ldots,n \tag{3.47}$$

where $\{\beta_i,\ i=1,\ldots,n\}$ are the roots of the nth degree polynomial $T_{n+1}(x)$ obtained through the recursion⁶

$$T_{k+1}(x)=2xT_k(x)-T_{k-1}(x),\qquad T_1(x)=1;\quad T_0(x)=-\frac{\rho_{21}}{\rho_{12}}.$$

In special cases it is indeed possible to obtain closed-form expressions for the eigenvalues of $S^TS$, for example, if $\rho_{12}=\rho_{21}$, then [22]

$$d_i^2=\rho_{12}^2+\rho_{21}^2+2\rho_{12}\rho_{21}\cos\frac{2\pi i}{2n+1},\qquad i=1,\ldots,n. \tag{3.48}$$


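At any rate, it is easy to show that the eigenvalues of the Toeplitz matrix T obtained by substituting the entry $(S^TS)_{11}=\rho_{12}^2$ by $\rho_{12}^2+\rho_{21}^2$ are equal to [22]

$$\tilde d_i^2=\rho_{12}^2+\rho_{21}^2+2\rho_{12}\rho_{21}\cos\frac{\pi i}{n+1},\qquad i=1,\ldots,n. \tag{3.49}$$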

⁶ Except for the initial conditions, this is the recursive definition of the Chebyshev polynomials, whose zeros are trigonometric [21].

Fig. 5. Symbol-asynchronous Gaussian capacity region for rectangular signals and identical signal-to-noise ratios, when transmitters know offset between their signals.

Thus if $d_i^2$ were replaced by $\tilde d_i^2$ in the left sides of (3.45) and (3.46), these equations would follow immediately because of the continuity of the Riemann integrands therein. It is indeed valid to carry out this replacement because T and $S^TS$ differ in only one entry and are uniformly bounded in operator norm; thus they are asymptotically (as their dimension grows) equivalent [23], i.e., $S^TS$ is asymptotically Toeplitz, and since $g(\cdot,\theta_1,\theta_2)$ and $\gamma_k(\cdot,\theta_1,\theta_2)$ are continuous functions, their averages evaluated at $\{d_i^2\}_{i=1}^n$ and $\{\tilde d_i^2\}_{i=1}^n$ coincide as $n\to\infty$ [23, Theorem 2.3]. Hence (3.45) and (3.46) hold and the proof of (3.38) is complete.
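The limiting step can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the paper's computation: it uses the standard closed form for the eigenvalues of a tridiagonal Toeplitz matrix (diagonal ρ12² + ρ21², off-diagonal ρ12ρ21), checks them against the characteristic polynomial, and compares the eigenvalue average of a continuous test function g (chosen arbitrarily here) with the corresponding spectral average over ρ²(ω).

```python
import math

def toeplitz_eigs(rho12, rho21, n):
    """Eigenvalues a + 2*b*cos(i*pi/(n+1)) of the n x n tridiagonal Toeplitz matrix."""
    a, b = rho12**2 + rho21**2, rho12 * rho21
    return [a + 2.0 * b * math.cos(i * math.pi / (n + 1)) for i in range(1, n + 1)]

def char_poly(lam, rho12, rho21, n):
    """det(T - lam*I) via the three-term recursion for tridiagonal determinants."""
    a, b = rho12**2 + rho21**2, rho12 * rho21
    prev, cur = 1.0, a - lam
    for _ in range(n - 1):
        prev, cur = cur, (a - lam) * cur - b * b * prev
    return cur

def eig_average(g, rho12, rho21, n):
    """(1/n) * sum of g over the Toeplitz eigenvalues."""
    return sum(g(x) for x in toeplitz_eigs(rho12, rho21, n)) / n

def spectral_average(g, rho12, rho21, m=20000):
    """(1/(2*pi)) * integral over [-pi, pi] of g(rho^2(w)), midpoint rule."""
    a, b = rho12**2 + rho21**2, rho12 * rho21
    h = 2.0 * math.pi / m
    return sum(g(a + 2.0 * b * math.cos(-math.pi + (k + 0.5) * h)) for k in range(m)) / m

def g(x):
    """Arbitrary continuous test function of the squared crosscorrelation."""
    return math.log(1.0 + 3.0 * (1.0 - x))

gap_small_n = abs(eig_average(g, 0.6, 0.4, 20) - spectral_average(g, 0.6, 0.4))
gap_large_n = abs(eig_average(g, 0.6, 0.4, 400) - spectral_average(g, 0.6, 0.4))
```

The gap between the two averages shrinks roughly like 1/n, which is the behavior that (3.45) and (3.46) rely on.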

Finally,  note  that  Theorem  1  was  proved  under  the 
assumption  that  the  transmitters  are  frame  synchronous. 
However,  it  follows  from  the  results  in  [7]  that  the  same 
capacity  region  holds  even  if  the  transmitters  are  frame- 
asynchronous  because  the  capacity  region  is  achieved  by 
stationary  distributions. 

Fig. 5 shows the capacity region of the simplest possible symbol-asynchronous Gaussian multiple-access channel: the transmitters are assigned the same rectangular waveform and know the offset between their signals. The numerical computation of the capacity region is carried out using the results (IV.5)-(IV.13) of the functional optimization problem solved in the proof of Lemma 3. The worst-case offset between the signals is zero, in which case the channel is symbol synchronous and admits the scalar discrete-time model (2.6) resulting in the Cover-Wyner capacity region (2.7). The most favorable case occurs when the symbol offset is equal to half of the symbol period, in which case the outer region in Fig. 5 is the capacity region C in (3.13) computed with $\rho_{12}=\rho_{21}=0.5$. This capacity region, which is representative of that of any strictly asynchronous channel (i.e., when $\rho_{12}$ and $\rho_{21}$ are both nonzero), resembles a pentagon with smooth corners. As we saw, the reason the region C is not a pentagon is that there is no unique pair of spectral densities in (3.13) that maximizes all rate constraints simultaneously. Consider the pentagon defined by points B and B' in Fig. 5. This is the subset of (3.13)

$$\bigl\{(R_1,R_2):\ 0\le R_1\le F(S_1^*,0),\quad 0\le R_2\le F(0,S_2^*),\quad R_1+R_2\le F(S_1^*,S_2^*)\bigr\}$$

achievable with the unique spectral pair $(S_1^*,S_2^*)$ that maximizes the rate-sum constraint, i.e.,

$$F(S_1^*,S_2^*)=\max_{\substack{S_1\in P(w_1)\\S_2\in P(w_2)}}F(S_1,S_2) \tag{3.50}$$

and the rate-pairs B and B' correspond to $(R_1,R_2)=(F(S_1^*,0),\ F(S_1^*,S_2^*)-F(S_1^*,0))$ and $(R_1,R_2)=(F(S_1^*,S_2^*)-F(0,S_2^*),\ F(0,S_2^*))$, respectively.

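Note that according to the optimality conditions in (IV.3), (IV.4) (particularized to $\alpha=1/2$), the spectral pair $(S_1^*,S_2^*)$, written in normalized form $\phi_k^*(\omega)=S_k^*(\omega)/\sigma^2$, is the solution to

$$\phi_k^*(\omega)=\max\Bigl\{\frac{1}{2\theta_k}-\frac{1+\phi_j^*(\omega)}{1+\phi_j^*(\omega)\bigl(1-\rho^2(\omega)\bigr)},\ 0\Bigr\},\qquad(j,k)=(1,2),\ (2,1)$$

where $(\theta_1,\theta_2)$ is chosen so that (IV.2) is satisfied. Since $\rho_{12}$ and $\rho_{21}$ are nonzero, $\rho^2(\omega)$ is not a constant function of $\omega$, and hence neither are $S_1^*(\omega)$ and $S_2^*(\omega)$. However, the individual rate constraints

$$\frac{1}{4\pi}\int_{-\pi}^{\pi}\log\Bigl(1+\frac{S_k(\omega)}{\sigma^2}\Bigr)d\omega,\qquad k=1,2,$$

are maximized over the set $P(w_k)$ by the constant functions $S_k(\omega)=w_k$ and thus $(S_1^*,S_2^*)$ fails to achieve the largest possible individual rates.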


These rates are achieved (by one user at a time) at the points A and A' in Fig. 5. Point A is achieved by the spectral pair $(w_1,S_2^\circ)$, where

$$F(w_1,S_2^\circ)=\max_{S_2\in P(w_2)}F(w_1,S_2) \tag{3.51}$$

i.e., $S_2^\circ$ is the best spectrum for user 2 when user 1 transmits at full single-user speed $(1/2)\log(1+(w_1/\sigma^2))$. The solution to (3.51) is

$$S_2^\circ(\omega)=\sigma^2\max\Bigl\{\beta-\frac{1+\dfrac{w_1}{\sigma^2}}{1+\dfrac{w_1}{\sigma^2}\bigl(1-\rho^2(\omega)\bigr)},\ 0\Bigr\} \tag{3.52}$$

where $\beta$ is chosen so that $(1/2\pi)\int_{-\pi}^{\pi}S_2^\circ(\omega)\,d\omega=w_2$. Note that (3.52) admits the classical water-filling interpretation [24], [25] arising in the study of colored Gaussian single-user channel capacity.
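The water level β in (3.52) has to be found numerically in general. A minimal sketch follows, assuming a uniform midpoint grid on [−π, π] and plain bisection on β; the function name `water_fill`, the grid size, and the iteration count are illustrative choices, not taken from the paper.

```python
import math

def water_fill(w1, w2, rho12, rho21, sigma2=1.0, m=2048, iters=100):
    """Return (beta, S2) for (3.52), with S2 sampled on a midpoint grid of [-pi, pi]."""
    h = 2.0 * math.pi / m
    snr1 = w1 / sigma2
    floor = []
    for i in range(m):
        w = -math.pi + (i + 0.5) * h
        rho_sq = rho12**2 + rho21**2 + 2.0 * rho12 * rho21 * math.cos(w)
        # "Bottom of the vessel" seen by user 2 when user 1 uses a flat spectrum w1.
        floor.append((1.0 + snr1) / (1.0 + snr1 * (1.0 - rho_sq)))

    def avg_power(beta):
        return sigma2 * sum(max(beta - f, 0.0) for f in floor) / m

    # avg_power is nondecreasing in beta; bracket the root and bisect.
    lo, hi = 0.0, max(floor) + w2 / sigma2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if avg_power(mid) < w2:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    return beta, [sigma2 * max(beta - f, 0.0) for f in floor]

beta, S2 = water_fill(4.0, 4.0, 0.5, 0.5)
```

In this example user 2 places more power near ω = ±π, where ρ²(ω) is smallest and the interference from user 1 is least harmful.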

The segment uniting A and B does not belong to the boundary of the capacity region, and therefore, C is not a heptagon. This property, which is illustrated by the capacity region in Fig. 5, can be proved as follows. Choose $1/2<\alpha^*<1$ such that the rate pairs A and B (and their convex combinations) achieve the same value of the function $\alpha^*R_1+(1-\alpha^*)R_2$, i.e. (cf. (3.35)),

$$(2\alpha^*-1)F(w_1,0)+(1-\alpha^*)F(w_1,S_2^\circ)=(2\alpha^*-1)F(S_1^*,0)+(1-\alpha^*)F(S_1^*,S_2^*).$$

If the segment between A and B belonged to the boundary of the capacity region, then both A and B would attain $\max_{(R_1,R_2)\in C}\alpha^*R_1+(1-\alpha^*)R_2$. However, this is not possible due to the strict concavity of the function $(2\alpha-1)F(S_1,0)+(1-\alpha)F(S_1,S_2)$: any convex combination of the spectral pairs $(w_1,S_2^\circ)$ and $(S_1^*,S_2^*)$ will achieve strictly higher values of $\alpha^*R_1+(1-\alpha^*)R_2$ than A and B. In fact, the same argument can be employed to show that the transition from A to B contains no straight lines.

We are now ready to state and prove our main result concerning the capacity region of the asynchronous Gaussian multiple-access channel wherein the transmitters ignore their mutual offset $\tau_2-\tau_1$. The transmitters only know that the crosscorrelations $(\rho_{12},\rho_{21})$ that parametrize the channel belong to an uncertainty set $\Gamma$, which is determined by the choice of the signature waveforms. For example, if both users are assigned a rectangular waveform then the uncertainty set is equal to the segment $\Gamma=\{0\le\rho_{12}\le1,\ 0\le\rho_{21}\le1,\ \rho_{12}+\rho_{21}=1\}$. Note that in practical applications it may be of interest to model channels where the offset is not the only source of uncertainty for the crosscorrelations; for example, if the signature waveforms are sinusoidally modulated, the crosscorrelations depend on the relative phase between the carriers (e.g., for rectangular signals modulated by sinusoids whose frequency is a large multiple of the inverse of the symbol period T, we get $\Gamma=\pm\{0\le\rho_{12}\le1,\ 0\le\rho_{21}\le1,\ \rho_{12}+\rho_{21}\le1\}$). The following result puts no restrictions on the source of the uncertainty of the set of cross correlations $\Gamma$.

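Theorem 2: The capacity region of the energy-constrained asynchronous Gaussian multiple-access channel where the transmitters do not know their mutual offset is given by

$$C^*=\bigcup_{\substack{S_k(\omega)\ge0,\ \omega\in[-\pi,\pi]\\\frac{1}{2\pi}\int_{-\pi}^{\pi}S_k(\omega)\,d\omega\le w_k,\ k=1,2}}\Bigl\{(R_1,R_2):\ 0\le R_1\le\frac{1}{4\pi}\int_{-\pi}^{\pi}\log\Bigl(1+\frac{S_1(\omega)}{\sigma^2}\Bigr)d\omega,\quad 0\le R_2\le\frac{1}{4\pi}\int_{-\pi}^{\pi}\log\Bigl(1+\frac{S_2(\omega)}{\sigma^2}\Bigr)d\omega,$$
$$R_1+R_2\le\inf_{(\rho_{12},\rho_{21})\in\Gamma}\frac{1}{4\pi}\int_{-\pi}^{\pi}\log\Bigl(1+\frac{S_1(\omega)}{\sigma^2}+\frac{S_2(\omega)}{\sigma^2}+\frac{S_1(\omega)S_2(\omega)}{\sigma^4}\bigl[1-\rho_{12}^2-\rho_{21}^2-2\rho_{12}\rho_{21}\cos\omega\bigr]\Bigr)d\omega\Bigr\}. \tag{3.53}$$

Proof: Having shown the result for the special case of a singleton uncertainty set $\Gamma=\{(\rho_{12},\rho_{21})\}$, we will be able to proceed at a faster tempo by invoking several lemmas used in the proof of Theorem 1. The capacity of the compound decoder-informed multiple-access channel with memory can be shown to satisfy ([7], see also [26, p. 288])

$$C^*=\operatorname{closure}\Bigl(\liminf_{n\to\infty}\frac1nC_n^*\Bigr) \tag{3.54}$$

with

$$C_n^*=\bigcup_{X_1^n,\,X_2^n}\ \bigcap_{(\rho_{12},\rho_{21})\in\Gamma}\bigl\{(R_1,R_2):\ 0\le R_1\le I(X_1^n;Y^n(\rho_{12},\rho_{21})\mid X_2^n),\quad 0\le R_2\le I(X_2^n;Y^n(\rho_{12},\rho_{21})\mid X_1^n),$$
$$R_1+R_2\le I(X_1^n,X_2^n;Y^n(\rho_{12},\rho_{21}))\bigr\} \tag{3.55}$$

where $Y^n(\rho_{12},\rho_{21})$ denotes the output of channel (2.4) with crosscorrelations $(\rho_{12},\rho_{21})$, and $X_1^n$ and $X_2^n$ are independent random variables satisfying the same input constraints as in (3.15). This follows simply because the direct coding theorem can be proved using codebooks that do not depend on the actual channel (via random selection) while the fact that for reliable communication a code has to be good no matter which actual channel is in effect establishes the converse theorem. Using Lemma 1 and proceeding as in Theorem 1, we obtain that

$$C_n^*=\bigcup_{\substack{\Sigma_k\ge0\\\frac1n\operatorname{tr}\Sigma_k\le w_k,\ k=1,2}}\Bigl\{(R_1,R_2):\ 0\le R_1\le\tfrac12\log\det[I_n+\sigma^{-2}\Sigma_1],\quad 0\le R_2\le\tfrac12\log\det[I_n+\sigma^{-2}\Sigma_2],$$
$$R_1+R_2\le\inf_{(\rho_{12},\rho_{21})\in\Gamma}\tfrac12\log\det\Bigl[I_{2n}+\frac{1}{\sigma^2}\begin{bmatrix}\Sigma_1&0\\0&\Sigma_2\end{bmatrix}\begin{bmatrix}I_n&S^T\\S&I_n\end{bmatrix}\Bigr]\Bigr\} \tag{3.56}$$

where the only difference with respect to (3.23) is the minimization of the rate-sum constraint with respect to the cross-correlation-dependent matrix S.

In Theorem 1, we showed using the singular-value decomposition theorem that a set of eigenvectors exists that maximizes the three constraints in (3.23) no matter which eigenvalues are used, thus reducing the union therein to one over diagonal matrices. Here this property is no longer true, and we need to follow an alternative route, suggested by the following result.

Lemma 4: Define the circulant matrix

$$\hat S=\rho_{12}I_n+\rho_{21}\begin{bmatrix}0&1&0&\cdots&0\\0&0&1&\cdots&0\\\vdots&&&\ddots&\vdots\\0&0&0&\cdots&1\\1&0&0&\cdots&0\end{bmatrix}$$

which differs from S only in the (n, 1) entry (cf. (3.22)). Then for every $\delta>0$, and $n>n_\delta$ (independent of $\Sigma_1$, $\Sigma_2$, $\rho_{12}$ and $\rho_{21}$)

$$\Bigl|\frac1n\log\det\Bigl[I_{2n}+\frac{1}{\sigma^2}\begin{bmatrix}\Sigma_1&0\\0&\Sigma_2\end{bmatrix}\begin{bmatrix}I_n&S^T\\S&I_n\end{bmatrix}\Bigr]-\frac1n\log\det\Bigl[I_{2n}+\frac{1}{\sigma^2}\begin{bmatrix}\Sigma_1&0\\0&\Sigma_2\end{bmatrix}\begin{bmatrix}I_n&\hat S^T\\\hat S&I_n\end{bmatrix}\Bigr]\Bigr|\le\delta. \tag{3.57}$$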


Therefore, as far as computing $\lim_{n\to\infty}(1/n)C_n^*$ is concerned, we can substitute S by $\hat S$ in (3.56). The effect of this substitution is to introduce an artificial interference term between symbol 1 of user 1 and symbol n of user 2 (Fig. 6), resulting in a channel which can be thought of as a wrapped-around version of channel (2.1). By the circular symmetry of this new channel, we can intuitively expect the covariances achieving capacity to be circulant, and consequently, the existence of a set of optimum eigenvectors (whose components are powers of the complex roots of unity) which do not depend on the crosscorrelations. To show this, it suffices to write

$$\hat S=\hat U\hat D\hat U^* \tag{3.58}$$

where $\hat D$ is a diagonal matrix of the eigenvalues of $\hat S$, which coincide with the DFT of the first row of the circulant matrix $\hat S$ [23]

$$\hat d_k=\rho_{12}+\rho_{21}e^{-j2\pi(k-1)/n},\qquad k=1,\ldots,n \tag{3.59}$$

and $\hat U$ is the orthonormal matrix of eigenvectors of $\hat S$ given by

$$\hat u_{ik}=\frac{1}{\sqrt n}\,e^{-j2\pi(i-1)(k-1)/n},\qquad i=1,\ldots,n,\quad k=1,\ldots,n. \tag{3.60}$$

Fig. 6. Circular n-block memoryless two-user channel.

Using the decomposition (3.58) in lieu of (3.24) and Lemma 2 with $\Delta=\hat D$, and proceeding in a way similar to the proof of Theorem 1, we obtain

$$C_n^*=\bigcup\Bigl\{(R_1,R_2):\ 0\le R_1\le\tfrac12\log\det[I_n+\sigma^{-2}\Lambda_1],\quad 0\le R_2\le\tfrac12\log\det[I_n+\sigma^{-2}\Lambda_2],\ \ldots\Bigr\} \tag{3.61}$$

where

$$|\hat d_i|^2=\rho_{12}^2+\rho_{21}^2+2\rho_{12}\rho_{21}\cos\frac{2\pi(i-1)}{n}=\rho^2\Bigl(\frac{2\pi(i-1)}{n}\Bigr). \tag{3.62}$$

As we saw in the proof of Theorem 1, the convergence of the right side of (3.61) to (3.53) reduces to

$$\max_{\phi_k\in P(w_k/\sigma^2),\ k=1,2}\ \inf_{(\rho_{12},\rho_{21})\in\Gamma}\frac{1}{4\pi}\int_{-\pi}^{\pi}f\bigl(\phi_1(\omega),\phi_2(\omega),\rho^2(\omega)\bigr)\,d\omega=\lim_{n\to\infty}\max_{\substack{\phi_{ki}\ge0\\\frac1n\sum_{i=1}^n\phi_{ki}\le w_k/\sigma^2,\ k=1,2}}\ \inf_{(\rho_{12},\rho_{21})\in\Gamma}\frac{1}{2n}\sum_{i=1}^nf\Bigl(\phi_{1i},\phi_{2i},\rho^2\Bigl(\frac{2\pi(i-1)}{n}\Bigr)\Bigr). \tag{3.63}$$

However, note that the right side of (3.63) can be written as

$$\lim_{n\to\infty}\max_{\phi_k\in P_n(w_k/\sigma^2),\ k=1,2}\ \inf_{(\rho_{12},\rho_{21})\in\Gamma}\frac{1}{4\pi}\int_0^{2\pi}f\bigl(\phi_1(\omega),\phi_2(\omega),\rho_n^2(\omega)\bigr)\,d\omega \tag{3.64}$$

where

$$\rho_n^2(\omega)=\rho^2\Bigl(\frac{2\pi(i-1)}{n}\Bigr),\qquad\text{for }\omega\in\Bigl[\frac{2\pi(i-1)}{n},\frac{2\pi i}{n}\Bigr)$$

and $P_n(w_k/\sigma^2)$ is the subset of $P(w_k/\sigma^2)$ of piecewise constant functions on the partition $\{0,2\pi/n,\ldots,2\pi\}$. Since $\rho_n^2(\omega)$ is piecewise constant on that partition, it is easy to show that we can replace $P_n(w_k/\sigma^2)$ by $P(w_k/\sigma^2)$ in (3.64) without changing the maximum value for any n. Finally, (3.63) follows from the fact that for every $\epsilon>0$, and $n>n_\epsilon$ (independent of $\phi_1$, $\phi_2$, $\rho_{12}$, and $\rho_{21}$), the bound (3.65) holds. [Equation (3.65) is not recoverable from the source.]

Again notice that since the capacity region is achieved with stationary inputs, Theorem 2 holds regardless of whether or not the transmitters are frame synchronous.

Fig. 7. Symbol-asynchronous Gaussian capacity region when transmitters do not know offset between their signals.

Corollary: If both users are assigned identical waveforms (and they do not know their mutual offset), then the capacity region is invariant to symbol (and frame) asynchronism.

Proof: Because $1-\rho_{12}^2-\rho_{21}^2-2\rho_{12}\rho_{21}\cos\omega\ge0$, it is easy to see that the asynchronous capacity region (3.53) reduces to the Cover-Wyner region (2.7) if the uncertainty set $\Gamma$ includes either (0, 1) or (1, 0). This occurs when both users are assigned the same waveform.
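The step behind this proof can be made explicit for the rectangular-waveform uncertainty set quoted before Theorem 2. The short derivation below is a sketch under that assumption only ($\rho_{12}+\rho_{21}=1$ on $\Gamma$); it is not a reproduction of the paper's argument.

```latex
% On \Gamma = \{\rho_{12}+\rho_{21}=1\} we have (\rho_{12}+\rho_{21})^2 = 1, so
\begin{align*}
1-\rho_{12}^2-\rho_{21}^2-2\rho_{12}\rho_{21}\cos\omega
  &= (\rho_{12}+\rho_{21})^2-\rho_{12}^2-\rho_{21}^2-2\rho_{12}\rho_{21}\cos\omega \\
  &= 2\rho_{12}\rho_{21}\,(1-\cos\omega) \;\ge\; 0 .
\end{align*}
% The bracket vanishes identically at (\rho_{12},\rho_{21}) = (0,1) or (1,0), so the
% infimum over \Gamma of the rate-sum bound in (3.53) is
\[
\frac{1}{4\pi}\int_{-\pi}^{\pi}
  \log\Bigl(1+\frac{S_1(\omega)}{\sigma^2}+\frac{S_2(\omega)}{\sigma^2}\Bigr)\,d\omega ,
\]
% which, like the two individual bounds, is maximized by flat spectra S_k(\omega) = w_k.
```

With flat spectra all three bounds are attained at once, which is why the region collapses to a pentagon in this case.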

In Fig. 7 we can see the asynchronous capacity regions corresponding to two different assignments of the signature waveform: a) identical signals, resulting in the Cover-Wyner pentagon, and b) signals that are orthogonal when symbol synchronous, resulting in a pentagon with smooth corners.

IV. Efficiency Region

A fruitful way to represent the multiple-access capacity region is to consider the effective signal-to-noise ratio of a user who transmits at rate R, which we define as the signal-to-noise ratio, $\gamma$, required to achieve capacity R in a single-user channel, i.e.,

$$\gamma=\exp[2R]-1. \tag{4.1}$$

Since the mapping in (4.1) is one-to-one, the rate and the effective signal-to-noise ratio give the same information. It is convenient to normalize the effective signal-to-noise ratio with respect to the actual signal-to-noise ratio. This results in the performance measure we refer to as efficiency⁷ $\eta$, which is a parameter ranging from 0 to 1 that quantifies the performance degradation suffered by each user because of the presence of other users in the channel. Once the capacity region of a multiple-access channel is known it is immediate to obtain the efficiency region, by substituting each of the individual rates in terms of the respective efficiencies, i.e.,

$$R_k=\tfrac12\log\Bigl(1+\eta_k\frac{w_k}{\sigma^2}\Bigr). \tag{4.2}$$

⁷ An analogous performance measure was defined in the analysis of the minimum uncoded error probability of Gaussian multiple-access channels [27].

Fig. 8. Efficiency region as function of background noise level.

For example, it follows from the capacity region in (3.6) that the efficiency region of the symbol-synchronous channel is equal to

$$0\le\eta_1\le1,\quad0\le\eta_2\le1,\quad\eta_1\eta_2-(1-\eta_1)\frac{\sigma^2}{w_2}-(1-\eta_2)\frac{\sigma^2}{w_1}\le1-\rho^2 \tag{4.3}$$

where recall that $\rho$ is the crosscorrelation between the assigned waveforms.
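The mapping between rates and efficiencies in (4.1)-(4.2) and the membership test for (4.3) can be sketched as follows. The function names are hypothetical, rates are in nats, and `snr_k` stands for $w_k/\sigma^2$; the asymptotic test uses the limiting form of (4.3) as the noise level goes to zero.

```python
import math

def efficiency(rate, snr):
    """eta such that rate = 0.5 * log(1 + eta * snr), cf. (4.1)-(4.2)."""
    return (math.exp(2.0 * rate) - 1.0) / snr

def in_sync_efficiency_region(eta1, eta2, snr1, snr2, rho):
    """Membership test for the symbol-synchronous efficiency region (4.3)."""
    if not (0.0 <= eta1 <= 1.0 and 0.0 <= eta2 <= 1.0):
        return False
    lhs = eta1 * eta2 - (1.0 - eta1) / snr2 - (1.0 - eta2) / snr1
    return lhs <= 1.0 - rho**2 + 1e-12

def in_asymptotic_region(eta1, eta2, rho):
    """Limit of (4.3) as sigma^2 -> 0: eta1 * eta2 <= 1 - rho^2 on the unit square."""
    if not (0.0 <= eta1 <= 1.0 and 0.0 <= eta2 <= 1.0):
        return False
    return eta1 * eta2 <= 1.0 - rho**2 + 1e-12
```

For instance, with ρ = 0.5 the pair (0.9, 0.9) lies inside the region at 0 dB signal-to-noise ratio for both users but outside the asymptotic region, consistent with the region shrinking as the noise level decreases.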

This  efficiency  region  is  illustrated  in  Fig.  8,  as  a  func¬ 
tion  of  the  background  noise  level.  For  low  signal-to-noise 
ratios  the  efficiency  region  occupies  nearly  all  of  the  unit 
square  because  the  main  mechanism  limiting  performance 
is  the  background  Gaussian  noise,  rather  than  the  multi¬ 
ple-access  interference.  Conversely,  it  is  apparent  from 
Fig.  8  that  for  moderate-to-large  signal-to-noise  ratios  the 
efficiency  region  converges  to  an  asymptotic  region  which 
quantifies  the  underlying  limitation  of  the  multiple-access 
channel  due  to  the  cross  correlation  between  the  assigned 
signal  waveforms.  The  region  in  (4.3)  admits  a  particularly 
simple  asymptotic  expression  as  the  noise  spectral  density 
goes  to  zero: 

E  ~  {(r?i,i?2):  0^  ij^l,  iWjSl-p2}. 

(4.4) 

The  usefulness  of  the  asymptotic  efficiency  region  is 
threefold:  it  provides  a  simple  way  to  characterize  multi¬ 
ple-access  capacity  in  high  signal-to-noise  ratio  situations; 
it  gives  a  lower  bound4  to  the  efficiency  region  achievable 
at  any  background  noise  level,  which  depends  only  on  the 
assigned  signal  waveforms  and  not  on  the  signal-to-noise 
ratios,  and  it  gives  an  intuitive  characterization  of  the 

ieriormance  degradation  in  a  multiuser  channel  in  terms 

*Il  is  shown  in  (14)  thal  the  efficiency  region  is  ctonotomcally  increas¬ 
ing  in  a1  (c I.  Fig.  8). 


ve*dO:  the  capacity  region  of  the  jymjol-asynchronous  gaussian  multiple- access  channel 
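Letting $\sigma^2 \to 0$, the asymptotic efficiency region results:

$$E = \bigcup_{\frac{1}{2\pi}\int_{-\pi}^{\pi} S_k(\omega)\,d\omega = 1} \Bigl\{(\eta_1,\eta_2):\ \eta_1 \le \exp\frac{1}{2\pi}\int_{-\pi}^{\pi}\log S_1(\omega)\,d\omega,\quad \eta_2 \le \exp\frac{1}{2\pi}\int_{-\pi}^{\pi}\log S_2(\omega)\,d\omega,$$

$$\eta_1\eta_2 \le \exp\frac{1}{2\pi}\int_{-\pi}^{\pi}\log S_1(\omega)\,d\omega\ \exp\frac{1}{2\pi}\int_{-\pi}^{\pi}\log S_2(\omega)\,d\omega \inf_{(\rho_{12},\rho_{21})\in\Gamma}\exp\frac{1}{2\pi}\int_{-\pi}^{\pi}\log(1-\rho^2(\omega))\,d\omega\Bigr\} \tag{4.6}$$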


of  the  additional  power  required  to  achieve  single-user 
capacity. 

For example, suppose that the users' objective is to
transmit at rates $R_1$ and $R_2$, respectively. If they were
operating in a single-user channel, these rates could be
achieved with powers $w_k = \sigma^2(\exp[2R_k]-1)$, $k = 1, 2$.
However, when they share the same channel, these powers
are no longer sufficient to guarantee reliable communication
at rates $R_1$ and $R_2$. The asymptotic efficiency region
(4.4) indicates that the sum of their powers in dB has to
increase by $-10\log(1-\rho^2)$ dB and that the way the users
split the burden of increasing their powers is immaterial as
long as the total power increases by the prescribed amount.
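As a small numerical illustration of this statement, the sketch below evaluates the required total power increase $-10\log_{10}(1-\rho^2)$ for a few cross correlations. The function name and the sample values of $\rho$ are illustrative choices, not taken from the paper.

```python
import math


def total_power_penalty_db(rho):
    """Minimum increase (in dB) of the sum of the two users' powers in dB
    implied by the asymptotic constraint eta_1 * eta_2 <= 1 - rho**2."""
    if not abs(rho) < 1:
        raise ValueError("the penalty is finite only for |rho| < 1")
    return -10.0 * math.log10(1.0 - rho ** 2)


# Illustrative cross-correlation values (hypothetical, chosen for the example).
for rho in (0.1, 0.5, 0.9):
    print(f"rho = {rho}: {total_power_penalty_db(rho):.2f} dB")
```

The penalty vanishes for orthogonal waveforms and grows without bound as $|\rho| \to 1$, which matches the degenerate region obtained for $\rho = 1$.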

In the conventional scalar multiple-access channel, which
corresponds to the users being assigned the same waveforms,
i.e., $\rho = 1$, the asymptotic efficiency region is (via
(4.4)):

$$\{0 \le \eta_1 \le 1,\ \eta_2 = 0\} \cup \{\eta_1 = 0,\ 0 \le \eta_2 \le 1\}.$$

Thus  when  the  signal-to-noise  ratio  is  high,  the  best  strat¬ 
egy  is  to  let  one  of  the  users  transmit  at  practically  full 
single-user  speed,  while  the  other  user’s  rate  is  kept  at  a 
very  low  level.  This  is  considerably  more  efficient  than 
time-division  multiple-access  (TDMA)  signaling  whose 
asymptotic  efficiency  is  equal  to  zero  for  both  users — 
although  if  both  rates  are  required  to  be  the  same,  then 
TDMA  is  indeed  almost  as  good  as  the  best  coding  for  low 
background  noise  (see  [28]). 

These conclusions do not hold in the case where the
assigned waveforms are different ($|\rho| < 1$). For example,
suppose that $\rho = 0.1$ and two equal-rate equal-energy users
with signal-to-noise ratio equal to 20 dB transmit at the
maximum possible rates. Had the users employed TDMA,
each of them would have required approximately 40 dB to
attain the same rate. Even in the case where there is heavy
cross correlation between the signals, TDMA is not near-optimum,
e.g., if $\rho = 0.9$, then TDMA would still require
33 dB to attain the same rate.
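The 40 dB and 33 dB figures can be reproduced with a short calculation. The sketch below rests on two assumptions not spelled out in this section: the symbol-synchronous sum-rate constraint has the form $R_1+R_2 \le \tfrac{1}{2}\log(1 + 2\,\mathrm{snr} + \mathrm{snr}^2(1-\rho^2))$ for equal-energy users, and a TDMA user is active on half of the symbols, with the quoted TDMA signal-to-noise ratio measured over its active symbols. Function names are illustrative.

```python
import math


def max_equal_rate(snr, rho):
    """Largest common rate (nats per symbol) of two equal-energy users,
    under the assumed synchronous capacity region."""
    single_user = 0.5 * math.log(1.0 + snr)
    half_sum_rate = 0.25 * math.log(1.0 + 2.0 * snr + snr ** 2 * (1.0 - rho ** 2))
    return min(single_user, half_sum_rate)


def tdma_required_snr_db(snr_db, rho):
    """SNR (dB) a TDMA user needs on its active symbols to match that rate."""
    snr = 10.0 ** (snr_db / 10.0)
    rate = max_equal_rate(snr, rho)
    # TDMA with two users: rate = (1/2) * (1/2) * log(1 + snr_tdma).
    snr_tdma = math.exp(4.0 * rate) - 1.0
    return 10.0 * math.log10(snr_tdma)


for rho in (0.1, 0.9):
    print(f"rho = {rho}: TDMA needs {tdma_required_snr_db(20.0, rho):.1f} dB")
```

Under these assumptions the sum-rate constraint is the binding one in both cases, so the TDMA requirement is driven entirely by the factor $1-\rho^2$.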

The  efficiency  region  of  the  asynchronous  Gaussian 
multiple-access  channel  is  (via  (3.53)  and  (4.2))  equal  to 


The constraints in (4.6) depend on the spectral densities
only through their geometric mean; therefore, all three
constraints are maximized simultaneously by a single pair
of spectral densities because the function that maximizes
the geometric mean subject to a constraint on its arithmetic
mean is constant. Therefore, (4.6) is equal to the
efficiency region achievable with white spectral densities
$S_k(\omega) = 1$, $\omega \in [-\pi, \pi]$, which implies that white inputs,
while not optimum in general, achieve capacity asymptotically
as the background noise level goes to zero. Then the
asymptotic efficiency region is the intersection of the unit
square with the hyperbolic region

$$\eta_1\eta_2 \le \inf_{(\rho_{12},\rho_{21})\in\Gamma}\exp\frac{1}{2\pi}\int_{-\pi}^{\pi}\log\bigl[1-(\rho_{12}^2+\rho_{21}^2)-2\rho_{12}\rho_{21}\cos\omega\bigr]\,d\omega = \inf_{(\rho_{12},\rho_{21})\in\Gamma}\frac{1}{4}\Bigl[\sqrt{1-(\rho_{12}+\rho_{21})^2}+\sqrt{1-(\rho_{12}-\rho_{21})^2}\Bigr]^2 \tag{4.7}$$


where the definite integral is found in [29, p. 560] (see also
[30, p. 384]). Note that this result generalizes the constraint
$1-\rho^2$ obtained for the product of the asymptotic efficiencies
in (4.4) when the users are synchronous. Equation
(4.7) indicates that, contrary to what is sometimes assumed
in pseudonoise sequence design, it is as important to
minimize the difference between the cross correlations as
to minimize their sum (the so-called periodic cross correlation).
The function on the right side of (4.7) is tightly
approximated by $1-\rho_{12}^2-\rho_{21}^2$ for low cross-correlation
values such as those in Fig. 9, where we can see the
uncertainty set of cross correlations between two carrier-modulated
spread spectrum waveforms used in CDMA
[31]. In this case, the minimization in (4.7) is attained by
the rightmost point in Fig. 9.
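The closed form behind (4.7) can be checked numerically. The sketch below assumes the integrand is $\log[1-\rho_{12}^2-\rho_{21}^2-2\rho_{12}\rho_{21}\cos\omega]$ and that the closed form is $\tfrac{1}{4}[\sqrt{1-(\rho_{12}+\rho_{21})^2}+\sqrt{1-(\rho_{12}-\rho_{21})^2}]^2$; it compares a midpoint-rule evaluation of the integral with that expression for one hypothetical pair of cross correlations.

```python
import math


def exp_log_integral(rho12, rho21, steps=2000):
    """exp{(1/2pi) * integral over [-pi, pi] of log(1 - rho^2(w)) dw},
    evaluated with the midpoint rule."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        omega = -math.pi + (k + 0.5) * h
        total += math.log(1.0 - rho12 ** 2 - rho21 ** 2
                          - 2.0 * rho12 * rho21 * math.cos(omega))
    return math.exp(total * h / (2.0 * math.pi))


def closed_form(rho12, rho21):
    """Assumed closed form of the same quantity."""
    s = (math.sqrt(1.0 - (rho12 + rho21) ** 2)
         + math.sqrt(1.0 - (rho12 - rho21) ** 2))
    return (s / 2.0) ** 2


rho12, rho21 = 0.3, 0.2  # hypothetical values with |rho12| + |rho21| < 1
print(exp_log_integral(rho12, rho21), closed_form(rho12, rho21))
print(1.0 - rho12 ** 2 - rho21 ** 2)  # low-cross-correlation approximation
```

Setting `rho21 = 0` recovers the synchronous value $1-\rho^2$, consistent with the remark that (4.7) generalizes (4.4).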


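$$E = \bigcup_{\frac{1}{2\pi}\int_{-\pi}^{\pi} S_k(\omega)\,d\omega = 1} \Bigl\{(\eta_1,\eta_2):\ \eta_1+\frac{\sigma^2}{w_1} \le \exp\frac{1}{2\pi}\int_{-\pi}^{\pi}\log\Bigl[\frac{\sigma^2}{w_1}+S_1(\omega)\Bigr]d\omega,\quad \eta_2+\frac{\sigma^2}{w_2} \le \exp\frac{1}{2\pi}\int_{-\pi}^{\pi}\log\Bigl[\frac{\sigma^2}{w_2}+S_2(\omega)\Bigr]d\omega,$$

$$\Bigl(\eta_1+\frac{\sigma^2}{w_1}\Bigr)\Bigl(\eta_2+\frac{\sigma^2}{w_2}\Bigr) \le \inf_{(\rho_{12},\rho_{21})\in\Gamma}\exp\frac{1}{2\pi}\int_{-\pi}^{\pi}\log\Bigl[\Bigl(S_1(\omega)+\frac{\sigma^2}{w_1}\Bigr)\Bigl(S_2(\omega)+\frac{\sigma^2}{w_2}\Bigr)-\rho^2(\omega)S_1(\omega)S_2(\omega)\Bigr]d\omega\Bigr\} \tag{4.5}$$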
sufficiency of the respective outputs, the channels in (2.4) and
(I.3) have the same capacity region.

Appendix  II 

Proof  of  Lemma  1:  To  prove  the  identity 
det  [  /2„  +  e"2£[  A"'A"'r]  /f  ] 



Fig.  9.  Locus  of  cross  correlations  for  maximal-length  pseudonoise 
sequences  with  31  chips  per  symbol. 


Appendix  I 

To  simplify  the  coding  theorems  for  multiple-access  channels 
with  memory  invoked  to  find  the  capacity  region  of  the  asyn¬ 
chronous  Gaussian  channel,  it  is  convenient  to  obtain  a  discrete¬ 
time  channel  equivalent  to  (2.4)  and  whose  noise  process  is 
independent.  The  idea  is  to  obtain  a  set  of  sufficient  statistics 
that  are  independent  given  the  transmitted  symbols,  but  that 
unlike  those  in  (2.4)  are  not  a  minimal  set.  We  define  (cf.  Fig.  2): 


y{ 


(0-r')T+Tly(0*2(t-iT-T2)dt  (I.la) 

■'(/+t)ir+»l 


yj  (i)  -/<<+  ^ r,>(/)a2(/  — »r- t2)  c*.  (Lib) 

It  is  clear  from  (2.2)  and  (1.1)  that 

Mi)-y?{i)+yt{i)- 

Thus  the  set  of  quantities  {>,(*)}7-i.(»*(*)}7-i.  ^  {Ai  0}7-i 
are  sufficient  statistics  for  the  transmitted  messages.  To  obtain 
the  explicit  dependence  of  yf(i)  and  yf(i)  on  the  transmitted 
symbols, it is convenient to define the partial energies of $\{s_2(t)\}$:


1 

r* 

3 

1 

■4 

(a 

MW 

(1-2*), 
Thus $e_1 + e_2 = 1$, and it follows from (2.2), (2.4), (I.1), and (I.2)
that we can write
(I.3)


iv(i)  -  lU+l}T+\(t)s2(t-iT-r2)dt  (I.4a) 

'J+or+n 

•“(0  “  |u  +  1)r+T>n( t)j2(t-ir-T2)dt  (I.4b) 

T+t, 

The channel in (I.3) has memory because of the dependence on
previous inputs; however, since the random process $\{n(t)\}$ is
white, the noise sequence in (I.3) is independent. Because of the

we will first decompose the crosscorrelation matrices $E[X^n X^{nT}]$
and $R$ using the Kronecker product.

-?.®[o  S]+I=®[2  ?]  <Iu> 

and 

2]+M2  ?]+*•[?  2H4S  l\ 

(n.3) 

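It is straightforward to check that if $A$, $B$, $C$, $D$ are $n \times n$
matrices, then

$$A\otimes\begin{bmatrix}1&0\\0&0\end{bmatrix}+B\otimes\begin{bmatrix}0&1\\0&0\end{bmatrix}+C\otimes\begin{bmatrix}0&0\\1&0\end{bmatrix}+D\otimes\begin{bmatrix}0&0\\0&1\end{bmatrix}=P\begin{bmatrix}A&B\\C&D\end{bmatrix}P^T \tag{II.4}$$

where $P$ is the permutation (orthogonal) matrix whose only
nonzero entries are

$$P_{2j-1,\,j} = 1, \quad j = 1, \dots, n, \tag{II.5a}$$

$$P_{2j-2n,\,j} = 1, \quad j = n+1, \dots, 2n. \tag{II.5b}$$

Therefore, (II.2) and (II.3) can be written, respectively, as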

Lr 


R-P 


L  »  “2 

h  sT 

s  /. 


(n.6) 

(n.7) 


and (II.1) follows from (II.6), (II.7), and the orthogonality of $P$
upon using the identity $\det(I + AB) = \det(I + BA)$.
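The Kronecker-product rearrangement used in this proof can be checked on a small case. The sketch below assumes the identity reads $A\otimes E_{11}+B\otimes E_{12}+C\otimes E_{21}+D\otimes E_{22} = P\,[A\ B;\ C\ D]\,P^T$, where $E_{kl}$ is the $2\times 2$ matrix with a single one in position $(k,l)$ and $P$ has ones at $(2j-1, j)$ for $j \le n$ and at $(2j-2n, j)$ for $j > n$. The scan of (II.4)-(II.5) is damaged, so this form is a reconstruction, and all helper names are illustrative.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def add(*mats):
    """Entrywise sum of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]


def shuffle_permutation(n):
    """Assumed P: ones at (2j-1, j) for j <= n and (2j-2n, j) for j > n (1-based)."""
    p = [[0] * (2 * n) for _ in range(2 * n)]
    for j in range(1, 2 * n + 1):
        row = 2 * j - 1 if j <= n else 2 * j - 2 * n
        p[row - 1][j - 1] = 1
    return p


def block(a, b, c, d):
    """Assemble the block matrix [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


# Arbitrary 2 x 2 test matrices (hypothetical values).
A, B = [[1, 2], [3, 4]], [[5, 6], [7, 8]]
C, D = [[9, 10], [11, 12]], [[13, 14], [15, 16]]
E11, E12 = [[1, 0], [0, 0]], [[0, 1], [0, 0]]
E21, E22 = [[0, 0], [1, 0]], [[0, 0], [0, 1]]

lhs = add(kron(A, E11), kron(B, E12), kron(C, E21), kron(D, E22))
P = shuffle_permutation(2)
rhs = matmul(matmul(P, block(A, B, C, D)), transpose(P))
print(lhs == rhs)
```

Because $P$ is orthogonal, conjugating by it leaves determinants unchanged, which is what the final step of the proof relies on.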

APPENDIX  III 

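Proof of Lemma 2: Define the following diagonal matrices:
$F = \mathrm{diag}\{\cos\theta_1, \dots, \cos\theta_n\}$, $G = \mathrm{diag}\{e^{j\phi_1}\sin\theta_1, \dots, e^{j\phi_n}\sin\theta_n\}$,
where $\theta_i = (1/2)\arcsin(|\lambda_i|) \in [0, \pi/4]$ and $\lambda_i = |\lambda_i|e^{j\phi_i}$, $i = 1, \dots, n$.
It is straightforward to check that

$$\begin{bmatrix}F&G^*\\G&F\end{bmatrix}\begin{bmatrix}F&G^*\\G&F\end{bmatrix}^{*}=\begin{bmatrix}F&G^*\\G&F\end{bmatrix}^{2}=\begin{bmatrix}I_n&\Lambda^*\\\Lambda&I_n\end{bmatrix}. \tag{III.1}$$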
Therefore, we can write
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■cti/2,  +  — 


A 

°1 

!'-  A*ll 

.0 

BJ 

Kf 

— Gf]pTp. 


'M  a  f'H 

III. 2) 


where $P$ is the orthogonal matrix introduced in Appendix II. It
follows from (II.4) that


F 

G 


G • 

F 


is a block diagonal matrix whose $i$th diagonal block is the $2 \times 2$
matrix

$$\begin{bmatrix}\cos\theta_i & e^{-j\phi_i}\sin\theta_i\\ e^{j\phi_i}\sin\theta_i & \cos\theta_i\end{bmatrix}$$

whereas 


]'r 


,Ctl  hn  + 


1\A  0l[*.  A* 

cHo  bJ[a  /, 


I  I  {  det  /2  +  3 


1  I  cos#,  e’^'sin#, 
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cos  8, 
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e~*’  sin#,  1 

e*‘  sin#, 

cos#,  / 

1  { 1  +  aH  +  bll  +  fl<A0  ~  l*/l)2}  • 


(III-3) 


that maximizes the strictly concave Lagrangian (if $\rho^2 < 1$, then
$f(z_1, z_2, \rho^2)$ is strictly concave in $(z_1, z_2)$)

L(*t.*2)-f  /(<j,i(w).'I,2(w).P2(«))^« 

-^[/'^t(«)^-^l]-^[£^:(«)</«-5]  (IV.l) 

for some pair of positive Lagrange multipliers $(\theta_1, \theta_2)$ selected so
that

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}\Phi_k^*(\omega)\,d\omega = \frac{w_k}{\sigma^2}, \quad k = 1, 2. \tag{IV.2}$$


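The optimum pair $(\Phi_1^*, \Phi_2^*)$ is the unique element in the cone of
nonnegative function pairs that satisfies10 (e.g., [33, p. 227])

$$\theta_1 \ge f_1(\Phi_1^*(\omega), \Phi_2^*(\omega), \rho^2(\omega)) = \frac{2\alpha-1}{1+\Phi_1^*(\omega)} + \frac{(1-\alpha)[1+\Phi_2^*(\omega)(1-\rho^2(\omega))]}{1+\Phi_1^*(\omega)+\Phi_2^*(\omega)+\Phi_1^*(\omega)\Phi_2^*(\omega)(1-\rho^2(\omega))} \tag{IV.3}$$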


is  a  block  matrix  whose  tth  diagonal  block  is 
p„  0 

o  K 

Because of the assumption on the magnitude of the elements of
$\Lambda$, the Hermitian matrix

$$\begin{bmatrix}I_n&\Lambda^*\\\Lambda&I_n\end{bmatrix}$$

is nonnegative definite and, therefore, so is the matrix in the right
side of (III.2). This allows us to apply the generalized Hadamard
inequality [18]* (the determinant of a positive-definite matrix is
upper-bounded by the product of the determinants of its diagonal
blocks) to the matrix in the right side of (III.2); the result is
the inequality


$$\theta_2 \ge f_2(\Phi_1^*(\omega), \Phi_2^*(\omega), \rho^2(\omega)) = \frac{(1-\alpha)[1+\Phi_1^*(\omega)(1-\rho^2(\omega))]}{1+\Phi_1^*(\omega)+\Phi_2^*(\omega)+\Phi_1^*(\omega)\Phi_2^*(\omega)(1-\rho^2(\omega))} \tag{IV.4}$$

with equality if $\Phi_1^*(\omega) > 0$ and $\Phi_2^*(\omega) > 0$, respectively; here $f_1$
and $f_2$ are the partial derivatives of $f$ with respect to its first two
arguments. It follows from the second condition that

$$\Phi_2^*(\omega) = \max\Bigl\{\frac{1-\alpha}{\theta_2} - \frac{1+\Phi_1^*(\omega)}{1+\Phi_1^*(\omega)(1-\rho^2(\omega))},\ 0\Bigr\} \tag{IV.5}$$

which implies that

$$1-\alpha > \theta_2 \tag{IV.6}$$

for otherwise $\Phi_2^*(\omega) = 0$ for all $\omega \in [-\pi, \pi]$, which does not
satisfy (IV.2). Let us now see what conditions force each of the
solutions to be zero at a particular frequency.

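On the one hand, if $\Phi_1^*(\omega_0) = 0$, then (IV.5) and (IV.6) imply
that

$$\Phi_2^*(\omega_0) = \frac{1-\alpha}{\theta_2} - 1 \tag{IV.7}$$

which when substituted in (IV.3) results in

$$\rho^2(\omega_0) \ge \frac{\alpha-\theta_1}{(1-\alpha)-\theta_2}. \tag{IV.8}$$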

Finally, it is immediate to check that (3.31) is satisfied with
equality if $A$ and $B$ are diagonal matrices.

APPENDIX IV

Proof of Lemma 3: The proof involves the solution of the
maximization in (3.40), thus yielding an explicit way to compute
the capacity region in (3.13). The maximum on the left side of
(3.40) is achieved by the pair of nonnegative functions $(\Phi_1^*, \Phi_2^*)$


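One conclusion that can be drawn from condition (IV.8),
$\rho^2(\omega_0) \ge (\alpha-\theta_1)/((1-\alpha)-\theta_2)$, is that

$$\alpha > \theta_1 \tag{IV.9}$$

because otherwise, $\Phi_1^*(\omega) = 0$ for all $\omega \in [-\pi, \pi]$.

On the other hand, if $\Phi_2^*(\omega_0) = 0$, then (IV.3) and (IV.9) imply
that

$$\Phi_1^*(\omega_0) = \frac{\alpha}{\theta_1} - 1 \tag{IV.10}$$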

* Among the many existing proofs of the Hadamard inequality, Cover
and El Gamal [32] have given a simple information-theoretic proof. One
of the nice features of that proof is that it can be immediately extended to
prove the generalized Hadamard inequality.

10 If $|\rho_{12}| + |\rho_{21}| = 1$, then $\rho^2(\omega) = 1$ for either $\omega = 0$ or $\omega = \pi$, and
uniqueness of $(\Phi_1^*, \Phi_2^*)$ is guaranteed, except in the set $\{\omega: \rho^2(\omega) = 1\}$ of measure
zero, because $f(z_1, z_2, 1)$ is not strictly concave.
which upon substitution in (IV.5) results in the condition

$$\rho^2(\omega_0) \ge \frac{\alpha}{1-\alpha}\,\frac{(1-\alpha)-\theta_2}{\alpha-\theta_1}. \tag{IV.11}$$

Note that since $1/2 \le \alpha < 1$, (IV.8) and (IV.10) cannot be true
simultaneously if $\rho^2(\omega_0) < 1$. However, they can indeed be false
simultaneously, in which case (IV.3) and (IV.4) are satisfied with
equality and $(\Phi_1^*(\omega), \Phi_2^*(\omega))$ are the positive solutions to the
system of the following equations:

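$$\Phi_2^*(\omega) = \frac{1-\alpha}{\theta_2} - \frac{1+\Phi_1^*(\omega)}{1+\Phi_1^*(\omega)(1-\rho^2(\omega))} \tag{IV.12}$$

$$\frac{2\alpha-1}{1+\Phi_1^*(\omega)} + \frac{(1-\alpha)[1+\Phi_2^*(\omega)(1-\rho^2(\omega))]}{1+\Phi_1^*(\omega)+\Phi_2^*(\omega)+\Phi_1^*(\omega)\Phi_2^*(\omega)(1-\rho^2(\omega))} = \theta_1. \tag{IV.13}$$

It follows that for each fixed pair of Lagrange multipliers, the
maximizing spectra $(\Phi_1^*(\omega), \Phi_2^*(\omega))$ depend on $\omega$ only through
$\rho^2(\omega)$, i.e., $\Phi_k^*(\omega) = \gamma_k(\rho^2(\omega), \theta_1, \theta_2)$, which is a continuous
function of $\rho^2(\omega)$.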

Appendix  V 

Proof  of  Lemma  4:  Let  £  be  the  2n  x2n  matrix  whose  only 
nonzero  elements  are  £12,  -  ,  -1,  and  let  Z,  be  the  nonneg¬ 

ative-definite  square  root  of  2,.  Then  we  can  write 


10gdet[/2"  +  ?[o  22][ 

1  f  2 

-log del  4,  +  —  , 
0  L 

-  logdet[/2„  +p21A/£] 


2,  0 

0  22. 

\S  '.]] 

1 

s,  o  ir 

/j«  +  7 

0  22j[ 

k  sr[ 
Xs  l*\. 


z,  oir  i 
0  4“+^ 


The  determinant  in  the  right  side  of  (V.l)  is  easily  computed: 

det(4  +  fh\ME]  "1  +  pii^u*  +  2p21Wl2,  “Pl2  ^11^2*2* 

il  +  2p21A/l2, 

^  1 +  2|Pj,|/M„  A4«2* 


where the first two inequalities follow from the nonnegative
definiteness of $M$, and the third inequality is a consequence of
the fact that, according to (V.2), the largest eigenvalue of $M$ is
upper-bounded by that of

,  °] 

0  22 

vnich  is  in  turn  UDper-bounded  by  (tr2,  +  tr22)/o2  5(w,+ 
)/oi.  Interchanging  the  roles  of  S  and  S,  the  same  bound  can 


be  shown  for  the  reverse  difference  in  (V.l),  i.e., 


1  /  w.  +  hs  \ 

£-log(l  +  -~22n),  (V.4) 

completing  the  proof  of  Lemma  4. 
Abstract— In this paper we focus on two areas of communication
network  design  in  which  methods  of  control  and  optimization  theory  have 
proven  useful.  These  are  the  area  of  multiple  access  communication  (for 
networks  with  shared  links  such  as  radio  networks  and  local  area 
networks) and the area of network routing (for networks with
point-to-point interconnections). We review a few selected problems in each area
to  show  the  role  of  the  control  concepts  involved  and  we  then  proceed  to 
identify  other  areas  of  communication  network  design  in  which  the  same 
control  theoretic  and  optimization  methodology  may  be  applicable  and 
useful.  We  do  not  survey  the  work  done  in  this  area,  nor  do  we  review 
work  in  control  areas  whose  methods  are  applicable  in  other  communica¬ 
tion  network  problems.  Instead,  we  attempt  to  bring  to  the  attention  of 
the  control  systems  community  the  numerous  instances  of  problems 
arising  in  the  pure  communication  network  design  process  that  can 
benefit from the attention and the capabilities of this community.


I.  Introduction 

COMMUNICATION  networks  are  designed  and  built  in  order 
to share resources. If interconnecting systems and bandwidths
were  available  at  no  cost,  then  the  solution  to  the  problem  of 
communication  would  be  to  assign  dedicated  communication  links 
(channels)  of  sufficient  capacity  to  every  pair  of  conceivable  users 
to  meet  their  needs.  This  not  being  the  case,  it  is  necessary  to 
multiplex  the  sources  of  communication  traffic  in  order  to 
optimize  various  cost  criteria.  Frequently,  this  optimization  is 
dynamic  and  done  on  the  basis  of  feedback  that  monitors  the 
evolution  of  the  degree  of  utilization  of  the  network  resources. 
Thus,  we  should  expect  a  number  of  problems  arising  in 
communication  network  design  to  fit  naturally  in  the  framework 
of control systems design. In this paper we wish to demonstrate
that  indeed  this  is  the  case  and  to  show  how  various  control  and 
optimization  methodologies  have  been  used  in  the  study  of 
communication  networks. 

In  the  beginning  there  was  a  single  communication  network,  the 
telephone network. It represented a multibillion dollar investment
and seemed to serve reasonably adequately the voice communication
needs. The explosive growth in data communication needs
during  the  last  30  years  built  up  the  pressure  for  additional  and 
alternative  networking  options.  As  a  result,  the  notion  of  store- 
and-forward  switching  (known  also  as  message  switching)  was 
introduced  in  the  early  1960’s.  This  notion  represented  a 
breakthrough  since  it  constituted  a  radical  reversal  of  thinking 
with  respect  to  the  circuit-switching  process,  namely,  instead  of 
securing an open, dedicated "pipe" for the transmission of
messages by means of hardware switches, it allowed a step-by-step
(node-by-node) forwarding of messages, thereby permitting
each  node  to  switch  messages  by  deciding  when  and  where  to 
transmit  the  messages  in  its  buffer.  In  the  last  20  years  we  have 
seen  an  avalanche  of  technologies  (fast  switching,  time  division 
multiplexing,  local  area  networks,  fiber  optical  networks,  inte¬ 
grated  services  digital  networks,  etc.)  and  a  proliferation  of 
operational  public  and  private  networks  that  put  these  technolo¬ 
gies  to  test  and  challenged  communication  engineers.  In  addition, 
they  should  challenge  control  engineers  as  well. 

Without  attempting  a  survey  of  this  vast  application  area  we 
wish  to  promulgate  the  viewpoint  that  many  (if  not  most)  specific 
sub-problems  in  the  network  design  process  are  natural  control 
problems.  In  support  of  this  thesis,  we  choose,  first,  to  demon¬ 
strate  how  two  major  areas  in  communication  networks  (routing 
and  multiple  access)  have  benefitted  from  the  use  of  techniques 
borrowed  from  what  is  traditionally  perceived  as  control  systems 
methodology  and,  second,  to  mention  additional  areas  that  are 
likely  to  benefit  from  the  control  systems  community.  As 
illustrated  in  this  paper,  the  techniques  that  have  proved  useful  in 
communication networks include dynamic programming (e.g.,
[2], [6], [8]-[10], [22], [29], [38], [39], [47], [49], [54]); linear
programming  (e.g.,  [50],  [51]);  constrained  and  iterative  optimi¬ 
zation  (e.g.,  [5],  [14],  [16],  [42]);  Markov  decision  theory  tools 
(e.g.,  [2],  [26],  [29],  [38]);  control  of  Markov  chains  (e.g.,  [11], 
[17],  [18],  [20],  [40],  [45]);  stability  analysis  of  stochastic 
systems  via  Lyapunov  methods  (e.g.,  [31],  [43]);  sample  path 
dominance  (e.g.,  [2],  [52]);  and  convergence  of  distributed  and 
asynchronous  algorithms  (e.g.,  [6],  [16],  [42]). 

The  problem  of  routing  is  encountered  in  all  and  every  network 
that  does  not  permit  the  source  to  reach  the  destination  in  a  single 
transmission  hop,  but  instead  it  must  traverse  a  path  of  intermedi¬ 
ate  links.  By  contrast,  the  problem  of  multiple  access  is 
encountered  primarily  in  those  networks  that  permit  the  nodes  to 
reach  their  destination  directly  in  one  hop  by  having  to  share  the 
same  link  with  other  transmitting  nodes.  In  addition,  the  two 
problems  are  fundamentally  different  in  nature  and,  jointly,  cover 
considerable  ground  in  the  networking  area.  Finally,  together  they 
facilitate  the  identification  of  additional  design  issues  and  the 
extension  of  the  applicability  of  suitable  control  methods.  Thus, 
they  represent  “cornerstone”  areas  of  network  design. 

Routing  can  be  studied  either  macroscopically  or  microscopi¬ 
cally.  The  macroscopic  viewpoint  considers  basically  a  flow 
model  and  determines  the  splitting  of  the  flow  in  order  to  reach  the 
destination  in  minimum  time  with  efficient  use  of  the  network 
resources.  It  is  traditionally  referred  to  as  static  routing.  The 
microscopic  viewpoint  dissects  the  flow  process  down  to  the 
atomic  level  of  the  individual  transmission  unit,  the  message  (a 
string  of  bits  commonly  referred  to  as  packet),  and  determines  the 
path  each  message  must  follow  at  each  of  its  hops  through  the 
network.  It  is  traditionally  referred  to  as  dynamic  routing.  Both 
viewpoints  are  explored  in  Section  II. 

Multiple  access  is  a  collective  term  that  refers  to  numerous 
problems  that  deal  with  the  dynamic  allocation  of  a  single 
resource  among  users  who  can  coordinate  their  use  of  that 
resource  only  by  making  use  of  that  resource.  These  problems 
arise  primarily  in  the  context  of  radio  channels  but  also  in  the 
context of shared cable resources in local area networks. In
Section  III,  we  explore  the  main  multiple  access  problems  where 
control  methods  have  been  successfully  applied. 

Both  in  the  case  of  routing  as  well  as  in  the  case  of  multiple 
access  we  place  the  emphasis  on  the  control  techniques  that  have 
been  used.  We  then  show  how  these  techniques,  sometimes  with 
slight  modification,  can  be  naturally  transported  to  other  problem 
areas  such  as  voice-data  integration,  flow  control,  and  the 
scheduling  of  messages  and  links.  This  is  done  in  Section  IV. 

II.  Network  Routing 

The  problem  of  routing  in  communication  networks  is  one  that 
has  received  early  attention  and  has  experienced  significant 
breakthroughs  in  the  brief  history  of  the  field  of  communication 
networks.  It  is  one  of  the  first  problems  that  gained  prominence  as 
a  result  of  the  emergence  of  store-and-forward  switching.  It  is  also 
one  in  which  analytical  tools  and  available  theories  applied  nicely 
from  the  beginning. 

A.  Static  Routing 

Given a network (a set of nodes connected by directed links) a
path connecting the source node to the destination node has to be
selected from the set of all possible such paths.1 In the simplest
formulation, the problem is one of finding the shortest path, i.e., a
length is assigned to each link and the optimization criterion is the
total path length. This problem is one of the archetypical
combinatorial optimization problems (the solution can be found by
exhaustive enumeration of a finite set of possibilities: all possible
paths from source to destination). Among the many existing
shortest path algorithms (see, e.g., [41]), the Bellman-Ford
algorithm (1956) is of particular interest to our exposition, both
because it is based on dynamic programming and because, as we
will see below, it easily lends itself to distributed asynchronous
implementation. A natural choice to find the shortest path from
source to destination in a layered network (i.e., one in which the
nodes can be grouped in subsets $U_1, \dots, U_M$ such that the source
and destination nodes belong to $U_1$ and $U_M$, respectively, and
there are links only between nodes in adjacent layers $U_{k-1}$ and
$U_k$) such as the one in Fig. 1, is the dynamic programming
algorithm, where the shortest paths and distances (costs-to-go) of
the nodes in layer $U_k$ to the destination are computed based on the
shortest paths and distances of the nodes in layer $U_{k+1}$. If the


1 All the algorithms and results discussed in this section can be extended to
the case where there are several source-destination pairs in the network.


Fig.  2.  Arbitrary  network  showing  link  lengths.  Source  is  node  1  and 
destination  is  node  5. 

network is not layered (such as that in Fig. 2), its shortest path can
be obtained by finding the shortest path in a layered network
derived from the original one as specified in the Bellman-Ford
algorithm: the number of layers is equal to the number of nodes in
the original network, say $N$; each layer contains a copy of each of
the $N$ nodes; and there is a link connecting two nodes in
consecutive layers if such a link exists in the original network. In
addition, copies of the same node in consecutive stages are
connected by a zero-length link. (Fig. 1 was actually derived from
Fig. 2 using this rule.) It is easy to see by induction that $D_k(i)$, the
cost-to-go of node $i$ in layer $N-k$, is the minimum length of any
path from $i$ to the destination that uses at most $k$ links (in the
original network). Since no shortest path uses more than $N-1$
links (link lengths are assumed nonnegative and, therefore, no
path containing loops need be considered), the cost-to-go of node $i$
at layer 1, $D_{N-1}(i)$, will indeed be the length of the shortest path
from node $i$ to the destination. Thus, the Bellman-Ford algorithm
can be formulated as the iteration

$$D_k(i) = \min_{j \in N(i)} [D_{k-1}(j) + d_{ij}] \quad \text{for } k = 1, \dots, N-1 \tag{2.1}$$

where $d_{ij}$ is the length of the link from $i$ to $j$, $N(i)$ is the set of
nodes for which such a link exists and it is assumed that $D_0(i) = \infty$
if $i$ is not the destination node, which corresponds to the
removal of all the nodes but the destination in the final layer (Fig.
1).
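The iteration in (2.1) can be sketched in a few lines. The code below is a minimal centralized version under stated assumptions: the network is given as a dictionary of directed link lengths, the zero-length link between copies of the same node is modeled by letting each node keep its previous cost-to-go, and the five-node example is hypothetical (it is not the network of Fig. 2, whose link lengths are not reproduced here).

```python
import math


def bellman_ford(links, destination):
    """Costs-to-go D_{N-1}(i) to `destination`.

    links: dict mapping a directed link (i, j) to its nonnegative length d_ij.
    """
    nodes = {node for link in links for node in link} | {destination}
    # D_0: zero at the destination, infinity elsewhere.
    cost = {i: (0.0 if i == destination else math.inf) for i in nodes}
    for _ in range(len(nodes) - 1):  # k = 1, ..., N-1
        new_cost = {}
        for i in nodes:
            # The zero-length link between copies of node i keeps D_{k-1}(i).
            candidates = [cost[i]]
            candidates += [cost[j] + d for (a, j), d in links.items() if a == i]
            new_cost[i] = min(candidates)
        cost = new_cost
    return cost


# Hypothetical network: source is node 1, destination is node 5.
example_links = {(1, 2): 1, (1, 3): 4, (2, 3): 2, (2, 4): 6,
                 (3, 4): 1, (3, 5): 5, (4, 5): 1}
costs_to_go = bellman_ford(example_links, destination=5)
print(costs_to_go[1])
```

Each pass uses only the costs-to-go of a node's immediate neighbors, which is the property that makes the algorithm a candidate for distributed operation.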

Contrary  to  what  may  appear  at  first  glance  there  is  a  lot  more 
to  network  routing  than  finding  shortest  paths.  After  all,  the 
shortest  path  may  not  be  the  best  path.  The  reason  is  that  the  real 
goal  is  to  minimize  the  delay  experienced  in  going  from  source  to 
destination,  and  the  delay  encountered  in  each  link  is  usually  a 
function  of  the  amount  of  traffic  carried  by  the  link  (as  the  link 
becomes  congested,  it  takes  longer  to  go  through  it),  which  is 
referred  to  as  the  link  flow  and  is  quantified  in  packets  (or 
messages)  per  second.  Then,  assuming  a  given  desired  flow  level 
from  source  to  destination,  the  problem  is  how  to  distribute  it 
among  all  the  possible  paths  so  as  to  minimize  the  total  delay.  In 
contrast  to  the  previous  more  elementary  formulation  of  the 
routing  problem  which  led  to  the  shortest  path  combinatorial 
optimization  problem  and  which  corresponds  to  the  special  case  in 
which  the  link  delays  are  independent  of  the  flows,  we  now  face  a 
continuous  optimization  problem  which  can  be  written  as 

$$\text{minimize } F(x) = \sum_{(i,j)} D_{ij}\Bigl(\sum_{n \in P(i,j)} x(n)\Bigr)$$

$$\text{subject to } x \in X = \Bigl\{(x(1), \dots, x(J)) \in R^J,\ \sum_{n=1}^{J} x(n) = \lambda,\ x(n) \ge 0\Bigr\} \tag{2.2}$$

Fig. 3. Characterization of the solution to the minimum-delay routing
problem.


where the set of all paths from source to destination is labeled
$\{1, \dots, J\}$; $x = (x(1), \dots, x(J))$ is the vector of unknown
nonnegative path flows which sum up to $\lambda$, the desired flow from
source to destination; $P(i,j) \subset \{1, \dots, J\}$ is the subset of paths
that traverse link $(i,j)$; and $D_{ij}(x)$ is the portion of the overall
delay contributed by the link from node $i$ to node $j$ when the flow
it carries is equal to $x$. In order to characterize a global solution to
the optimization over a convex set in (2.2), it is natural to restrict
attention to convex penalty functions. In practice, it is common
that the incremental delay in a link grows with the amount of
traffic it carries and, therefore, it can be assumed that the
functions $D_{ij}$ are convex without affecting significantly the
practical applicability of the results.

Now, the characterization of the solution to (2.2), $x^*$, is
straightforward. Since the feasible set $X$ and the penalty function
$F$ are convex, it is necessary and sufficient that the directional
derivative of the penalty function be nonnegative when evaluated
at $x^*$ in the direction of any of the elements of $X$ (e.g., [37])

$$0 \le \lim_{\alpha \downarrow 0} \frac{1}{\alpha}[F((1-\alpha)x^* + \alpha x) - F(x^*)] \quad \forall x \in X \tag{2.3}$$

which translates into

$$0 \le \sum_{(i,j)} D'_{ij}\Bigl(\sum_{m \in P(i,j)} x^*(m)\Bigr)\sum_{n \in P(i,j)}[x(n) - x^*(n)] = \sum_{n=1}^{J}[x(n) - x^*(n)]\,d_{x^*}(n) \quad \text{for all } x \in X \tag{2.4}$$

where $d_x(n) = \sum_{(i,j) \in L(n)} D'_{ij}(\sum_{m \in P(i,j)} x(m))$ is the length of path
$n$ when the length of each link is equal to the derivative of its delay
evaluated at the set of flows $x$, and $L(n)$ is the set of links used by
path $n$. The solution to (2.4), $x^*$, is the vector in $X$ that minimizes
its inner product with the vector of distances $d_{x^*}$. Thus, $x^*$ puts
all its weight on the smallest component(s) of $d_{x^*}$. The conclusion
is that the optimum flow uses only shortest paths computed
according to the derivative of the link delays.
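This optimality condition lends itself to a direct check. The sketch below tests whether every path carrying flow is a shortest path when each link length is the derivative of its delay. The two-path network, the capacities, and the delay model $D(f) = f/(C-f)$ (an M/M/1-type choice) are assumptions made for the example; the text does not fix a delay function.

```python
import math

# Hypothetical example: two parallel paths, each made of a single link.
PATH_LINKS = [("a",), ("b",)]
CAPACITY = {"a": 10.0, "b": 5.0}


def delay_derivative(link, flow):
    """Derivative of the assumed link delay D(f) = f / (C - f)."""
    c = CAPACITY[link]
    return c / (c - flow) ** 2


def first_derivative_lengths(path_flows, path_links=PATH_LINKS):
    """Length of each path when link lengths are delay derivatives."""
    link_flow = {}
    for flow, links in zip(path_flows, path_links):
        for link in links:
            link_flow[link] = link_flow.get(link, 0.0) + flow
    return [sum(delay_derivative(link, link_flow[link]) for link in links)
            for links in path_links]


def is_optimal(path_flows, tol=1e-9):
    """True if all flow travels on minimum first-derivative-length paths."""
    lengths = first_derivative_lengths(path_flows)
    shortest = min(lengths)
    return all(flow <= tol or length - shortest <= tol
               for flow, length in zip(path_flows, lengths))


# For a total flow of 4, equalizing the two derivative lengths gives this split.
f_a = (10.0 - math.sqrt(2.0)) / (1.0 + math.sqrt(2.0))
print(is_optimal([f_a, 4.0 - f_a]), is_optimal([2.0, 2.0]))
```

The even split fails the test because path "b" carries flow although its derivative length exceeds that of path "a".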

This solution to the minimum-delay routing problem allows us
to check whether a given set of flows is optimum. Unfortunately,
it does not tell us how to find the optimum flows. Indeed, we face
the chicken-and-egg situation depicted in Fig. 3. The optimum
flows are obtained by solving a shortest path problem; but in order
to compute the link lengths it is necessary to know the optimum
flows. Nevertheless, the foregoing characterization of the optimal
solution does suggest a possible iterative procedure to find the
optimum set of flows. Starting with a given set of flows $x$ one can
compute the minimum derivative shortest paths for that flow, and
hence, a new flow, $x^*(x)$ that is positive only along those shortest
paths. The process can then be repeated, until there is no
appreciable cost decrease. The region of convergence of such a
procedure can be improved by letting the new flow be a convex
combination of $x$ and $x^*(x)$, i.e.,

$$x_{k+1} = (1 - \alpha_k)x_k + \alpha_k x^*(x_k).$$

This is the so-called flow deviation method of Fratta, Gerla, and
Kleinrock [14], where $0 \le \alpha_k \le 1$ is chosen to minimize

$$F((1 - \alpha_k)x_k + \alpha_k x^*(x_k)),$$

which  is  a  special  case  of  the  feasible-direction  nonlinear 
programming  algorithm  due  to  Frank  and  Wolfe  [13].  The 
convergence  of  the  flow  deviation  method  to  the  optimum  routing 
is rather slow because unfavorable paths tend to carry considerable
flow during many iterations unless the initial routing guess is
particularly fortuitous. Such a behavior can be improved by
reducing the flow along each nonminimum derivative path in
accordance with the delay experienced in that path. This is the idea
of  iterative  routing  algorithms  based  on  gradient  projection 
nonlinear  optimization  methods  (e.g.,  [4])  in  which  the  flow 
decrease  along  a  nonminimum  derivative  path  is  proportional  to 
the  difference  between  its  length  and  that  of  the  shortest  path 
(according  to  the  first  derivative  of  the  delay  function).  If  such  a 
decrease  would  result  in  a  negative  flow,  then  the  flow  along  that 
path  is  set  to  zero  (hence,  the  projection  to  the  set  of  feasible 
flows). 
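
The flow deviation iteration described above can be illustrated on a toy problem. The following Python sketch is ours and rests on assumptions that are not part of the text: a single origin-destination pair served by parallel single-link paths, an M/M/1-type cost $x_p/(C_p - x_p)$ on each path, capacities that each exceed the total demand, and a ternary search standing in for the exact minimization over the step size. All function names are hypothetical.

```python
def delay(x, cap):
    """Total cost: sum over paths of x_p / (C_p - x_p) (assumed M/M/1 form)."""
    return sum(f / (c - f) for f, c in zip(x, cap))


def marginal(x, cap):
    """First derivative of each path cost, C_p / (C_p - x_p)^2 (the path 'length')."""
    return [c / (c - f) ** 2 for f, c in zip(x, cap)]


def line_search(x, target, cap, iters=80):
    """Ternary search for the step in [0, 1] minimizing the convex cost."""
    def mix(a):
        return [(1 - a) * f + a * g for f, g in zip(x, target)]

    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if delay(mix(m1), cap) <= delay(mix(m2), cap):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


def flow_deviation(x, cap, steps=50):
    """Repeat: find the shortest path by first derivative, then move toward it."""
    demand = sum(x)
    assert all(c > demand for c in cap), "sketch assumes every path can carry all demand"
    costs = [delay(x, cap)]
    for _ in range(steps):
        d = marginal(x, cap)
        best = min(range(len(x)), key=d.__getitem__)
        # All-or-nothing flow x*(x): the whole demand on the current shortest path.
        target = [demand if p == best else 0.0 for p in range(len(x))]
        a = line_search(x, target, cap)
        x = [(1 - a) * f + a * g for f, g in zip(x, target)]
        costs.append(delay(x, cap))
    return x, costs


if __name__ == "__main__":
    flows, costs = flow_deviation([1.0, 1.0, 2.0], [10.0, 6.0, 8.0])
    print(flows, costs[0], costs[-1])
```

With only two paths the feasible set is a segment, so a single exact line search already reaches the optimum; with three or more paths the iterates keep shifting small amounts of flow between paths, which is the slow tail mentioned above.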

We  have  seen  that  the  problem  of  static  network  routing  can  be 
formulated as a conceptually straightforward optimization problem
that admits well-known solutions in nonlinear programming.
What  sets  optimum  routing  in  communication  networks  apart 
from  other  multicommodity  flow  problems  arising  in  operations 
research  is  the  fact  that  the  optimization  is  carried  out  in  real 
time,  and  often,  in  distributed  fashion,  where  each  node  makes  its 
own  routing  decisions  based  on  local  information.  The  review  of 
centralized  routing  has  revealed  that  the  shortest  path  problem 
plays  a  central  role  in  solving  for  the  optimum  routing  regardless 
of  whether  the  link  congestion  measures  depend  on  the  link  flow 
or  not.  Hence,  we  will  start  the  exposition  of  distributed  routing 
algorithms by discussing the distributed version of the Bellman-Ford
shortest path algorithm.

The  Bellman-Ford  updating  equation  in  (2.1)  suggests  that  the 
algorithm  is  suited  for  decentralized  operation  because  each  node 
can update its own estimate of distance to the destination (cost-to-go)
provided it receives from its neighbors their own estimates
[appearing  on  the  right-hand  side  of  (2.1)].  The  feature  that  makes 
the  study  of  the  distributed  Bellman-Ford  algorithm  interesting  is 
that  it  can  run  completely  asynchronously,  in  the  sense  that  the 
updating  and  communication  times  need  not  be  coordinated  and 
convergence  can  be  guaranteed  by  simply  assuming  that  updating 
and  communication  between  nodes  never  cease,  without  any 
requirements  whatsoever  on  the  rate  of  communication.  The  proof 
of convergence is a nice illustration of the analysis of decentralized
algorithms where the processors are allowed to perform their
computations  and  to  communicate  the  corresponding  results 
completely  independently  of  one  another  [5],  [6].  The  idea  is  to 
show  that  the  estimates  computed  in  the  distributed  asynchronous 
algorithm  are  always  sandwiched  by  the  estimates  computed  by 
the  centralized  version  of  the  algorithm  when  started  at  two 
different  initial  conditions,  and  that  both  centralized  estimates 
converge  to  the  true  distances  to  the  destination  node. 

Those centralized estimates are denoted by $\bar{D}_k = (\bar{D}_k(1), \ldots,
\bar{D}_k(N))$ and $\underline{D}_k = (\underline{D}_k(1), \ldots, \underline{D}_k(N))$, and are the result of
the centralized Bellman-Ford iteration (2.1) when it is started
with initial conditions $\bar{D}_0 = (\infty, \ldots, \infty, 0)$ and $\underline{D}_0 = (0, \ldots,
0)$, respectively. (The destination node is assumed to be the $N$th
node.) Define the operator [see (2.1)]

$$B_i[D_k] = \min_{j \in N(i)} \left[ D_k(j) + d_{ij} \right] = D_{k+1}(i) \tag{2.6}$$

if $1 \le i < N$, and $B_N[D_k] = D_k(N)$. This operator is monotone
in the sense that if $D \le D^*$ (i.e., if $D(i) \le D^*(i)$, $i = 1, \ldots,
N$), then

$$B_i[D] \le B_i[D^*]. \tag{2.7}$$

The monotonicity of $B_i$ implies that

$$\underline{D}_k \le \underline{D}_{k+1} \le \bar{D}_{k+1} \le \bar{D}_k \tag{2.8}$$

and, moreover, it is easy to show that for sufficiently large $k$

$$\underline{D}_k = D_{N-1} = \bar{D}_k \tag{2.9}$$

which  is  the  vector  of  distances  from  each  node  to  destination  as 
we  saw  in  the  discussion  of  the  centralized  algorithm. 

In the asynchronous distributed version of the algorithm, it is
assumed that each node $i$ keeps at time $t \ge 0$ an estimate of its
distance to destination $A_t(i)$, and an estimate of the distance from
each of its neighbors $j \in N(i)$ to destination $A_t^i(j)$, which is
simply the latest estimate received from node $j$. In view of (2.8)
and (2.9), convergence of the algorithm will follow if we show
that for every index $k$, there exists a time $t_k \ge 0$ such that for all
$t \ge t_k$

$$\underline{D}_k \le A_t \le \bar{D}_k \tag{2.10}$$

and for $i = 1, \ldots, N - 1$

$$\underline{D}_k(j) \le A_t^i(j) \le \bar{D}_k(j), \quad j \in N(i). \tag{2.11}$$

This is shown by induction. If $k = 0$, then (2.10) and (2.11) hold
as long as the initial estimates of the decentralized algorithm are
nonnegative. Assuming that the induction hypothesis is true for $k$,
the monotonicity of $B_i$ implies that if $t \ge t_k$, then

$$\underline{D}_{k+1}(i) = B_i[\underline{D}_k] \le B_i[A_t^i] \le B_i[\bar{D}_k] = \bar{D}_{k+1}(i). \tag{2.12}$$

But $A_t(i)$ is a piecewise constant function of time which only
jumps at the updating times of node $i$, at which times it takes the
value

$$A_t(i) = B_i[A_t^i].$$

Therefore, we can write

$$\underline{D}_{k+1}(i) \le A_t(i) \le \bar{D}_{k+1}(i) \quad \text{for } t \ge t_k(i) \tag{2.13}$$

where $t_k(i)$ is the smallest updating time of node $i$ which is greater
than $t_k$. Moreover, if we wait long enough after $\max_i t_k(i)$, not
only will all the nodes have carried out their first updates after $t_k$
but the result of those computations will have been communicated
to their neighbors because of the assumption that updating and
communication occur infinitely often. Hence, there exists $t_{k+1} \ge
\max_i t_k(i)$ such that for all $t \ge t_{k+1}$ and for all $i$ and $j$

$$A_t^i(j) = A_s(j)$$

for some $s \ge t_k(j)$ (which depends on $t$, $i$, and $j$). Thus, it
follows from (2.13) that

$$\underline{D}_{k+1}(j) \le A_t^i(j) \le \bar{D}_{k+1}(j), \quad j \in N(i), \quad i = 1, \ldots, N - 1$$

completing the induction proof and, therefore, the proof of
convergence of the distributed asynchronous Bellman-Ford algorithm.
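
The convergence argument can be checked numerically on a small example. The Python sketch below is an illustration under our own assumptions, not the authors' implementation: nodes are numbered from 0 with the destination last, link lengths are positive integers, and asynchrony is modeled by drawing at random either an update at some node or the delivery of one node's current estimate to one neighbor. The names `true_distances`, `async_bellman_ford`, and `EXAMPLE_LINKS` are hypothetical.

```python
import random

# Hypothetical 5-node network; node 4 is the destination. Each undirected
# link of the given length is entered in both directions.
_EDGES = {(0, 1): 2, (0, 2): 5, (1, 2): 1, (1, 3): 7, (2, 3): 3, (3, 4): 2, (2, 4): 8}
EXAMPLE_LINKS = {**_EDGES, **{(j, i): d for (i, j), d in _EDGES.items()}}


def true_distances(links, n):
    """Centralized Bellman-Ford started from (inf, ..., inf, 0), run n - 1 times."""
    dist = [float("inf")] * (n - 1) + [0]
    for _ in range(n - 1):
        dist = [min(dist[j] + d for (a, j), d in links.items() if a == i)
                if i < n - 1 else 0
                for i in range(n)]
    return dist


def async_bellman_ford(links, n, events=20000, seed=0):
    """Random interleaving of node updates and estimate deliveries."""
    rng = random.Random(seed)
    dest = n - 1
    # Arbitrary nonnegative initial estimates; the destination knows it is at 0.
    est = [rng.randint(0, 50) for _ in range(n)]
    est[dest] = 0
    # heard[i, j] is node i's latest received copy of neighbor j's estimate.
    heard = {pair: rng.randint(0, 50) for pair in links}
    pairs = list(links)
    for _ in range(events):
        if rng.random() < 0.5:
            i = rng.randrange(n - 1)  # an update at any node but the destination
            est[i] = min(heard[i, j] + links[i, j] for (a, j) in pairs if a == i)
        else:
            i, j = rng.choice(pairs)  # node j's current estimate reaches node i
            heard[i, j] = est[j]
    return est


if __name__ == "__main__":
    print(true_distances(EXAMPLE_LINKS, 5))
    print(async_bellman_ford(EXAMPLE_LINKS, 5))
```

Because every update and every delivery keeps occurring, the only requirement of the proof is met, and the random schedule should end at the same distances as the centralized iteration whatever the initial estimates were.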

When  the  link  delays  depend  on  the  traffic  flows,  it  is  also 
possible  to  obtain  the  optimal  routing  that  solves  (2.2)  in  a 
distributed  asynchronous  fashion.  Gradient  projection  algorithms 
are  better  suited  for  this  task  than  the  flow  deviation  method 
because  in  the  latter  method  a  higher  degree  of  synchronization  is 
required  in  order  for  the  nodes  to  use  the  same  step  size  at  each 
iteration.  In  the  distributed  asynchronous  implementation  of 
gradient  projection  optimum  routing  algorithms,  each  node 
broadcasts from time to time the values of its outgoing flows to its
upstream neighbors, who in turn pass that information on to their
upstream neighbors. In this way, the source keeps estimates at all
times of the link flows and can carry out the gradient projection
iteration autonomously based on those estimates. The first
algorithm based on this idea was due to Gallager [16], who posed
an alternative formulation to (2.2), where the unknowns are the
fractions of flow routed to each outgoing link at each node, rather
than the path flows. Tsitsiklis and Bertsekas [42] showed the
convergence of the distributed asynchronous implementation of
gradient projection optimal routing algorithms provided the time
between consecutive broadcasts is small enough relative to the
speed at which the flows generated by the algorithm change. The
approach for showing the stability of this algorithm is very
different from the proof of convergence of the distributed
Bellman-Ford algorithm where the monotonicity of the dynamic
programming mapping implied that the estimates are closer and
closer to the solution regardless of the actual sequence of
communication and computation times. The idea here is that if the
step size of the algorithm is small enough, then the flows change
so slowly with respect to the periods between communication
times that their evolution is very close to that of the centralized
algorithm which uses the unique, true value of each link flow.

Fig. 4. Queueing model of a node with one incoming link and two outgoing links.

B.  Dynamic  Routing 

As  mentioned  earlier,  there  are  two  fundamentally  different 
philosophies  to  network  routing:  either  viewing  it  as  a  “flow” 
problem  in  which  the  traffic  of  messages  is  modeled  as  a 
“macro”-commodity  entering  the  network  as  a  single  entity 
(static  or  quasi-static  routing),  or  as  an  individualized-message 
path-finding  problem  in  which  the  traffic  is  broken  down  to  its 
constituent  elementary  units  (dynamic  routing)— a  dichotomy  akin 
to  that  of  statistical/quantum  mechanics  in  physics.  Whereas  the 
first  approach  leads  to  optimization  problems  where  time  plays  no 
role,  the  essential  ingredient  of  the  second  approach  is  the 
randomness  of  the  time-evolution  of  the  buffers  in  the  network, 
thus  placing  dynamic  routing  within  the  sphere  of  stochastic 
control. 

The most elementary instance of dynamic routing is the simple
queueing system shown in Fig. 4 which models a node with one
incoming link and two outgoing links. It simplifies considerably
the dynamics of the message arrival process and of the service
time characteristics and ignores processing delay. Thus, the
arrival instants of messages over the incoming link are assumed to
constitute a Poisson process of constant rate $\lambda$. Upon arrival each
message is put in the buffer of one of the two outgoing links. This
action represents the “control.” The buffers are assumed to have
unlimited (infinite) capacity and the message lengths are assumed
to be random with exponential distribution (an obvious additional
simplification) with parameter $\mu$. The two outgoing links have
equal capacity of $C$ bits/s. Thus, each link is modeled as a
queueing system with exponential service time distribution with
parameter $\mu C$. It is desired to characterize the optimal control
policy that minimizes the average total delay per message based
on the observations of the “state” of the system, namely the
numbers of messages $q_1$ and $q_2$ in the two buffers. The model, of
course, assumes that the head-of-the-line message is dropped from
the buffer as soon as the transmission of its last bit is completed.

This model, despite its simplicity, proved to be rather difficult
to analyze. For details, see [10]; it is not important to repeat them
here. It should suffice to state that the main result, which simply
requires that upon arrival a message should join the shortest queue
(with arbitrary decision in case the two queues have equal
numbers of messages), was hardly surprising. Yet an intricate
argument on the dynamic programming equation (DPE) was
needed and there were some counter-intuitive side-results including
the relaxation of the Poisson assumption on the arrivals, and
the fact that in the incomplete state information case, the
certainty-equivalent control (i.e., send the message to the
expected shortest queue) need not be optimum unless both queues
have the same number of customers initially.

The optimality of the send-to-shortest-queue (SS) policy in the
complete state information case can be proved in a rather strong
sense. At all times, the sum ($q_1 + q_2$) and maximum ($\max\{q_1,
q_2\}$) of the number of messages in both buffers are stochastically
minimized by the SS policy in the sense of the partial order
between random variables according to which the random variable
$X$ is stochastically smaller than $Y$ if $P[X \le a] \ge P[Y \le a]$ for
all $a$. The proof of optimality can be obtained by the method of
forward induction [53], whereby the desired stochastic ordering
between the queue sizes under the optimum and an arbitrary
policy is shown to be preserved at each transition.
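
A quick simulation makes the advantage of the SS policy visible. The sketch below is ours and only illustrative: it compares SS against routing by a fair coin flip, uses rates chosen arbitrarily (`lam` for the arrival rate and `mu` for the per-link service rate, i.e., the quantity $\mu C$ of the text), and relies on the uniformized chain, in which every event is an arrival or a potential departure, so that the time average of $q_1 + q_2$ equals its average over events. The function names are hypothetical.

```python
import random


def mean_backlog(policy, lam=1.5, mu=1.0, events=200_000, seed=1):
    """Average of q1 + q2 over the events of the uniformized chain."""
    rng = random.Random(seed)
    q = [0, 0]
    total = 0
    rate = lam + 2 * mu  # uniformization rate: arrivals plus both servers
    for _ in range(events):
        total += q[0] + q[1]
        u = rng.random() * rate
        if u < lam:
            q[policy(q, rng)] += 1  # arrival, routed by the policy
        else:
            i = 0 if u < lam + mu else 1
            if q[i] > 0:  # potential departure; a dummy transition if the queue is empty
                q[i] -= 1
    return total / events


def shortest(q, rng):
    """Send-to-shortest-queue; ties go to link 0 (any tie rule is allowed)."""
    return 0 if q[0] <= q[1] else 1


def coin_flip(q, rng):
    """State-independent routing, used here only as a baseline."""
    return rng.randrange(2)


if __name__ == "__main__":
    print("SS:", mean_backlog(shortest))
    print("coin flip:", mean_backlog(coin_flip))
```

With these rates the coin-flip baseline behaves like two independent M/M/1 queues at load 0.75, whose combined mean backlog is 6, and the SS policy should give a visibly smaller figure.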

The problem formulation of [10] is one of many related ones
(see [8], [9], [22], [24], [33], [38], [54], [55]) which are slightly
more complicated but share some fundamental characteristics
which, in fact, extend beyond the confines of the routing problem
into the areas of priority assignment, resource allocation, and flow
control. They are all Markovian decision process (MDP) problems.
In the sequel we will describe a fairly general MDP that
includes the dynamic routing problem as a special case. In fact, it
includes almost all of the queueing control problems that have
been studied in connection with communication network issues.
We will then outline the solution methodologies that have been
used. These include basically: 1) the derivation of optimality
conditions from the DPE associated with the corresponding MDP;
2) the use of sample path stochastic dominance arguments, and
finally; 3) the reformulation of the MDP as a linear program. We
should emphasize, lest the reader be unduly encouraged, that the
problems in this area are sufficiently complex, so that only modest
results can be generally obtained despite involved arguments and
nontrivial machinery. Typically, these results characterize some
structural properties of the optimal policy. However, knowledge
of such structure is often sufficient to permit close approximation
of the actual optimal policy by well-founded heuristics.

Let us recall briefly what an MDP is (for details, see [30]). We
need a state description of the process to be controlled. Let $S$ be its
state space. When in state $s \in S$, a set $A_s$ of admissible control
actions is specified. When action $a \in A_s$ is applied, there is a
transition from state $s$ to $s'$ that is governed by the probability
distribution $p(s' \mid s, a)$, and which occurs after a random time $\tau$
which is exponentially distributed with distribution denoted by
$t(\tau \mid s, a, s')$. Clearly, $p$ and $t$ together describe the stochastic
dynamics of the process to be controlled. Finally, each transition
is accompanied by a cost penalty that we denote by $c(\tau, s, a, s')$.

The dynamic routing problem we considered before fits in this
formulation easily. In that case, the state space is $S = \{0, 1, 2, 3,
\ldots\}^2$. An element $s = (q_1, q_2) \in S$ is simply the pair of values
of the respective queue sizes. The set of actions $A_s$ is the same for
any state and consists of $a_1$ and $a_2$ where $a_i$ is the action that
assigns an arriving message to the buffer of link $i$. The
distribution $p$ is of trivial form, in that the transitions are
deterministic. Assignment of an arrival to queue $i$ augments $q_i$ by
one. Note, now, that in addition to the arrival instants, the
departure (or service completion) instants are important because
they induce state transitions as well. A departure from queue $i$
reduces $q_i$ by one. When a departure occurs there is no
meaningful control action that can be applied in this particular
problem. The exponential distribution $t$ corresponds to times
between arrivals and/or departures.² Finally, the cost rate $c$ must
reflect the delay. By Little’s result in queueing theory, we know
that the average delay is proportional to the average number of
customers in the queue. Thus, $c(\tau, s, a, s')$ can be taken to be
simply equal to $(q_1 + q_2)$. This MDP formulation can be extended
to encompass more complicated queueing control problems.

² A slight modification of the model of transitions, called uniformization, is
useful in that it introduces dummy transitions from a state into itself; thus,
some situations which introduce nonessential complications can be handled
without departure from this discrete transition time formulation.

Let us return now to the general MDP. We need to specify the
notion of a control policy and the optimization criterion. Let us
denote by $\xi_1, \xi_2, \ldots$ the state transitions that occur at instants $t_1,
t_2, \ldots$. A policy $\pi$ is a sequence of decision rules $\pi_1, \pi_2, \ldots$,
where $\pi_n$ determines the choice of action at the transition time $t_n$.
It can be viewed as a conditional distribution on the set of actions
parametrized by the past history of the process.

The optimization criterion that corresponds to the practical case
of expected total delay is the long-run average expected cost;
namely, if we denote by $V(\pi, i, t)$ the expected cost incurred
under policy $\pi$, with initial state $i$, until time $t$, we consider as the
optimization criterion the value function

$$V(\pi, i) \triangleq \liminf_{t \to \infty} \frac{V(\pi, i, t)}{t}.$$

For technical reasons, however, that are well known to optimization
specialists, it is easier to establish optimality conditions if we
consider, instead, the so-called $\alpha$-discounted cost, i.e.,

$$V^{\alpha}(\pi, i) = \int_{t=0}^{\infty} e^{-\alpha t}\, dV(\pi, i, t).$$

The latter converges to the former as $\alpha \to 1$ under a variety of
stationarity conditions. For technical reasons that will become
apparent in the sequel, we will also consider the finite-horizon
costs. These are defined in a similar fashion except that we let
time extend only to $t_n$, the instant of the $n$th transition. If we
denote by $V^{\alpha}(i)$ and $V(i)$ (and also $V_n^{\alpha}(i)$, $V_n(i)$ for the finite
horizon cases) the values of these cost functions when $\pi$ is chosen
optimally, we are led to the following DPE:

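$$V^{\alpha}(i) = \inf_{a \in A_i} \sum_{i'} \left[ c(i, a, i') + \beta(i, a, i')\, V^{\alpha}(i') \right] p(i' \mid i, a)$$

or

$$V_{n+1}^{\alpha}(i) = \inf_{a \in A_i} \sum_{i'} \left[ c(i, a, i') + \beta(i, a, i')\, V_n^{\alpha}(i') \right] p(i' \mid i, a)$$

where

$$\beta(s, a, s') \triangleq \int_0^{\infty} e^{-\alpha \tau}\, dt(\tau \mid s, a, s')$$

and

$$c(s, a, s') \triangleq \int_0^{\infty} c(\tau, s, a, s')\, dt(\tau \mid s, a, s')$$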

are  the  discount  factor  and  cost  values  per  transition,  respectively. 
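
For the two-queue routing model, the finite-horizon recursion can be iterated numerically. The following Python sketch is an illustration under assumptions of our own: the chain is uniformized so that the discount factor per transition is a constant `beta`, the cost per transition is taken as $q_1 + q_2$, the buffers are truncated at `size` messages (an arrival routed to a full buffer is dropped), and the control acts only at arrival transitions. The names `routing_values` and `best_link` are hypothetical.

```python
def routing_values(lam=1.5, mu=1.0, beta=0.9, size=40, sweeps=200):
    """Iterate V_{n+1} = cost + beta * E[V_n] with the min taken at arrivals."""
    rate = lam + 2 * mu
    pa, pd = lam / rate, mu / rate  # probabilities of an arrival / a given departure
    V = [[0.0] * (size + 1) for _ in range(size + 1)]
    for _ in range(sweeps):
        new = [[0.0] * (size + 1) for _ in range(size + 1)]
        for q1 in range(size + 1):
            for q2 in range(size + 1):
                # Arrival: choose the link whose resulting state is cheaper.
                arrive = min(V[min(q1 + 1, size)][q2], V[q1][min(q2 + 1, size)])
                # Potential departures from each link (dummy transition if empty).
                depart = V[max(q1 - 1, 0)][q2] + V[q1][max(q2 - 1, 0)]
                new[q1][q2] = (q1 + q2) + beta * (pa * arrive + pd * depart)
        V = new
    return V


def best_link(V, q1, q2):
    """Link (0 or 1) chosen for an arrival in state (q1, q2); ties go to link 0."""
    return 0 if V[q1 + 1][q2] <= V[q1][q2 + 1] else 1


if __name__ == "__main__":
    V = routing_values()
    for state in [(0, 1), (1, 0), (0, 2), (1, 2)]:
        print(state, "-> link", best_link(V, *state))
```

For states far from the truncation boundary, the computed action should agree with the send-to-shortest-queue rule; near the boundary the truncation itself can distort the comparison, which is one practical face of the boundary difficulties discussed next.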

The DPE is of fundamental importance in the study of MDP’s
because the value function $V^{\alpha}$ usually has the convenient
properties of convexity, supermodularity, and other forms of
monotonicity that lead readily to sufficient conditions for optimality.
The difficulty with the analysis of the DPE is that the
optimality conditions are heavily problem-dependent and often
lead to explosively large numbers of cases to be verified
separately. This is especially true for MDP’s that arise from
queueing models. For this reason, and because of additional
difficulties that arise when the state is on the boundaries (see
[22]), it became evident that alternative methods of solution were
needed.

One alternative method that has received attention recently and
which  produced  successful  results  in  problems  of  queueing  control 
(akin  to  the  routing  problem)  is  a  probabilistic  method  called 
sample-path  or  stochastic  dominance.  This  method  bypasses 
completely  dealing  with  the  value  function.  Instead,  it  focuses 
directly on seeking the optimal policy. Let $G$ be the class of
admissible policies. If we suspect that the optimal policy $\pi^*$ has a
property $p$, then we can proceed as follows in order to prove that
it actually does have that property. Let $S$ be a subset of $G$, to
which we know the optimal policy belongs. We consider a subset
of policies $S_p \subset S$, all elements of which have the property $p$. For
every $\pi$ in $S$ that is not in $S_p$, we attempt to construct a policy
$\tilde{\pi} \in S_p$ which outperforms $\pi$. If we succeed, we must conclude
that the optimal policy belongs to $S_p$. In constructing $\tilde{\pi}$ we often
need to engage in
a careful reorganization of the underlying probability space in
order to align the sample paths properly, so that the comparison of
the two policies can be made for every sample path. This
procedure is full of risks and extreme care is required to avoid
faulty arguments. Note, also, that to apply this method usefully,
we must have “guessed” the properties of the optimal policy
correctly. Thus, at best, it is a method to verify the validity of our
conclusions, rather than a method that leads us to the right
conclusions.

Successful  use  of  the  stochastic  dominance  approach  was  made 
in [52] and [50] where a problem that is dual to the problem of
dynamic  routing  was  studied.  Specifically,  in  a  two-server 
queueing  system  in  which  the  two  servers  have  unequal  service 
rates,  we  wish  to  determine  whether  and  when  the  slower  server 
needs  to  be  activated  if  we  are  interested  in  minimizing  the  usual 
total  expected  delay  function.  That  the  optimal  policy  has  a 
threshold form (namely that the slower server must be activated
when  the  queue  size  exceeds  a  crucial  value)  was  proven  in  [29] 
via  the  DPE  method.  However,  the  alternative  proof  via  the 
arguments  of  stochastic  dominance  was  much  simpler  and  led  to  a 
generalization  of  the  result  to  cases  of  nonexponential  arrivals 
and/or  service,  that  could  not  have  been  easily  accomplished  by 
means  of  the  DPE  method. 

Another  successful  use  of  the  stochastic  dominance  method  has 
been noted in [2]. In this case the problem of optimally choosing
which customer to serve next in a single queueing system was
considered under the constraint that each customer must begin (or
terminate) service by an individually assigned random deadline or
else it is dropped from the system. The cost criterion is then to
minimize  the  expected  number  of  lost  customers.  It  was  proven 
that  scheduling  the  customer  with  shortest  time  to  extinction 
minimizes  this  cost. 

Although  these  problems  differ  from  routing,  the  model 
structures  are  quite  similar,  and  it  has  been  observed  that,  usually, 
queueing  control  problems  with  such  structural  similarities  can  be 
studied  equally  successfully. 

The  third  method,  which  was  first  used  in  [38]  in  the  study  of  a 
specific queueing control problem, and which has been broadly
extended recently in [51], is the linear programming approach.
Almost any queueing control problem that can be formulated as an
MDP (therefore the problem of dynamic routing, as well) can be
converted to an equivalent linear program (LP). The advantages
of this conversion are that it is problem-independent and it leads
occasionally to successful study of semi-Markov decision problems
as well. Furthermore, it facilitates considerably the characterization
of optimal solution properties. Here is how this
equivalence can be demonstrated.

Let us concentrate on an MDP under a finite-horizon, discounted
cost formulation.³ We shall consider a queueing model
with state dynamics given by

$$x_{k+1} = x_k + \xi_{k+1} z_{k+1}.$$

Here, $x_k$ denotes the state at $t_k$ (the instant of the $k$th transition), $\xi_k$
represents that transition, and $z_k$ represents the control action at
that transition. The transition $\xi_k$ can represent an arrival or a
departure as an increment of the state. The control $z_k$ is
conveniently defined to enable ($z_k = 1$) or disable ($z_k = 0$) a
transition. For example, in the routing model discussed at the
beginning of the section, the state is equal to a two-dimensional
vector of queue sizes, and the transition corresponding to sending
an arriving message to the first queue would be represented by $\xi_k
= [1\ 0]^T$. Indeed, a variety of queueing control problems (in fact,
the vast majority of those that have been considered in connection
with communication network problems) can be so represented.

³ The reason that we cannot work directly with infinite horizons is the
possibility of so-called duality gaps in linear programming theory with
infinite-dimensional variables.

Note  that  the  crucial  aspect  of  this  state  equation  is  the  linear 
dependence  on  the  controls.  Note  also  that  usually  the  cost 
function  is  linear  in  the  state  (since  the  usual  cost  criterion  is  the 
expected  delay  which  is  coupled  to  the  queue  sizes,  and  hence  the 
state,  by  Little’s  result).  Consequently,  the  cost  is  linear  in  the 
controls.  The  minimization  of  the  cost  over  the  set  of  control 
trajectories  is  constrained  since  the  state  equation  must  be  satisfied 
and  the  state  must  always  belong  to  an  admissible  set  (typically,  a 
set  of  vectors  with  integer-valued  coordinates  belonging  to  given 
ranges).  Thus,  the  constraints  are  also  linear  in  the  controls,  and 
the  problem  is  easily  formulated  as  an  LP.  There  are,  however, 
two  points  that  require  attention.  First,  the  controls  are  integer 
valued, i.e., $z_k \in \{0, 1\}$. Second, in the MDP the vectors $\xi_k$ are
random and depend on past history.

The first problem is taken care of in one of two ways: by
construction or by use of a property of the constraint matrix of the
linear program, called unimodularity. The construction method
involves using a noninteger optimum control whose quantized
version satisfies the MDP optimality conditions (see [38], [51] for
details). The use of unimodularity involves a well known result in
the theory of integer linear programming (e.g., [34]): if the
constraint matrix of an LP is integer valued and totally unimodular
(i.e., each of its subdeterminants is +1, -1, or 0), then, given
integer right-hand sides, all
the vertices of the feasible polytope are integer valued. Therefore,
no further restrictions are needed to guarantee that the solution of
a conventional LP will result in the integer valued optimal
control. Fortunately, in many queueing problems of interest
(including the dynamic routing problem), the constraint matrix is
indeed totally unimodular.
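
For small matrices, total unimodularity can be checked directly from the definition. The Python sketch below is ours and is meant only as an illustration: it enumerates every square submatrix, so it is practical only for tiny examples, and the two test matrices are hypothetical. The first is the lower-triangular matrix of ones that arises when a scalar state is a running sum of the controls; the second is a standard counterexample whose determinant is 2.

```python
from itertools import combinations


def det(m):
    """Integer determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))


def is_totally_unimodular(a):
    """True if every square submatrix has determinant -1, 0, or +1."""
    rows, cols = len(a), len(a[0])
    for k in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                sub = [[a[i][j] for j in c] for i in r]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


if __name__ == "__main__":
    running_sums = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
    odd_cycle = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
    print(is_totally_unimodular(running_sums), is_totally_unimodular(odd_cycle))
```

The exhaustive check grows combinatorially with the matrix size, so for realistic problems one would rely on structural arguments rather than on enumeration.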

The second problem is easily taken care of by thinking of the
$z_k$'s as functions from the sample space $\Omega$ to the action space.
Thus, the cost criterion can be written as a functional on the
underlying probability space.

Let $z_k(\omega_k)$ represent the control action at the $k$th transition,
where $\omega_k$ denotes the random “history” until the $k$th transition.
We have

$$x_{k+1}(\omega_{k+1}) = x_k(\omega_k) + \xi_{k+1}(\omega_{k+1})\, z_{k+1}(\omega_{k+1}).$$

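Let $S$ and $\mathcal{Z}$ be the sets of admissible states and controls,
respectively. The $\beta$-discounted, $n$-step, expected cost under
policy $z$ and initial condition $x$ is given by

$$J_{\beta}^{n}(x, z) = E_x \sum_{k=0}^{n} \beta^k L(z_k)$$

where

$$L(z_k) = c^T x_k + d^T z_k$$

($c$ and $d$ denote constant column vectors). This is a cost function
that is adequately general. For example, in a pure resource
allocation problem without blocking or rejection of messages we
have $d = 0$, while in pure blocking problems we take $c = 0$. The
state equation, after repeated iterations, yields

$$x_k(\omega_k) = x + \sum_{j=1}^{k} z_j(\omega_j)\, \xi_j(\omega_j), \quad k > 0.$$

Therefore,

$$J_{\beta}^{n}(x, z) = E_x \sum_{k=0}^{n} \beta^k \left[ c^T x + c^T \sum_{j=1}^{k} z_j \xi_j + d^T z_k \right]
= \frac{1 - \beta^{n+1}}{1 - \beta}\, c^T x + E_x \sum_{k=0}^{n} \beta^k \left[ c^T \sum_{j=1}^{k} z_j \xi_j + d^T z_k \right].$$

But

$$E_x(z_k) = \sum_{\omega_k} z_k(\omega_k) \Pr(\omega_k).$$

Hence

$$J_{\beta}^{n}(x, z) = \frac{1 - \beta^{n+1}}{1 - \beta}\, c^T x + \sum_{k=1}^{n} \sum_{\omega_k} \gamma_k(\omega_k)\, z_k(\omega_k)$$

where $\gamma_k(\omega_k)$ is a known function that depends on $\Pr(\omega_k)$, $c$, $\xi_k$,
and $\beta^k$. Consequently, the MDP is equivalent to

$$\min \sum_{k=1}^{n} \sum_{\omega_k} \gamma_k(\omega_k)\, z_k(\omega_k)$$

subject to

$$x + \sum_{j=1}^{k} z_j(\omega_j)\, \xi_j(\omega_j) \in S$$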

which  is  a  conventional  LP  where  the  initial  condition  plays  the 
role  of  a  parameter,  the  sensitivity  with  respect  to  which  can  be 
studied  by  the  well  developed  theory  of  sensitivity  analysis  of 
linear  programming  [15]. 

In  conclusion,  we  see  that  the  MDP  is  converted  to  an 
equivalent  LP  under  very  mild  conditions  that  are  usually  satisfied 
by  dynamic  routing  and  other  queueing  control  problems.  Thus,  a 
third alternative methodology becomes generally available for the
study of these problems. Whether to choose from the arsenal the
DPE approach, or the LP method, or stochastic dominance tools,
depends  on  the  problem  and  on  the,  as  yet  undeveloped,  intuition 
that  the  investigator  should  possess. 

III. Multiple-Access Communications

The communication networks considered in the discussion of
routing problems in Section II consist typically of a set of nodes
connected by point-to-point communication links. Each of these
links viewed in isolation can be modeled as a classical communication
channel with one sender and one receiver. In this section,
we consider multipoint-to-point communication links where several
transmitters share a common channel. Multiple access
channels  are  the  basic  building  blocks  of  radio  networks,  satellite 
communication,  and  local  area  networks,  and  during  the  last  15 
years  have  attracted  the  attention  of  many  communication, 
information,  and  control  theorists. 

There is a wide variety of strategies to divide the “resources”
of a communication channel among several geographically
dispersed transmitters. The simplest methods are those that assign a
permanent independent subchannel to each transmitter (e.g., in
frequency division multiple access and time division multiple
access); these strategies are easy to analyze and are widely used in
practice in situations where the users need to transmit at fairly
steady rates. If the transmitters are bursty (i.e., the ratio of peak
to average rate at which they need to transmit is high) those static
methods are inefficient since most of the time the channel is
underutilized while demand (and induced delay) accumulates at
busy terminal locations. Dynamic channel sharing strategies
overcome  this  problem  by  allocating  channel  resources  on  an  on- 
demand  basis.  Consistent  with  the  overall  spirit  of  this  paper,  our 
goal  here  is  not  to  review  this  vast  topic,  but  rather  to  demonstrate 
how  control  theory  can  play  a  useful  role  in  its  study.  Here  we 
wish  to  single  out  two  multiple  access  strategies:  random  access 
and  simultaneous  transmission,  which  are  broadly  representative 
of  dynamic  channel  sharing  systems  and  in  which  control  theoretic 
concepts  have  played  a  pivotal  role. 

In  random  access  communication,  the  conceptual  allocation 
model  is  addressed  without  an  effort  to  exploit  the  signaling 
degrees  of  freedom  and  the  micro-structure  of  the  transmitted 
messages.  For  this  purpose,  a  crude  channel  model  is  considered, 
that  achieves  this  separation  of  the  “macro”  from  the  “micro” 
problem.  In  simultaneous  transmission  systems,  however,  a  more 
refined  viewpoint  is  adopted,  by  taking  the  realities  of  the  medium 
into  account,  modeling  them,  and  exploiting  them. 

A.  Random-Access 

The object of interest here is the so-called collision channel
model, in which messages (called packets) require one time unit
(called  slot)  for  transmission  and  are  sent  by  a  population  of  users 
who  are  synchronized  so  that  their  slots  coincide  at  the  receiver, 
but  are  otherwise  uncoordinated  and  unaware  of  which  and  how 
many  users  have  packets  to  transmit.  If  two  or  more  packets  are 
simultaneously  transmitted,  it  is  assumed  that  the  receiver  is 
unable  to  recover  any  of  the  messages,  and  they  have  to  be 
retransmitted  in  a  future  slot.  In  the  ALOHA  algorithm,  which 
was developed in the early 1970’s [1] at the University of Hawaii
and marked the beginning of the area of random-access communication,
each packet that has been unsuccessfully transmitted
before  is  transmitted  with  probability  p  in  the  next  slot.  New 
packets  which  have  not  attempted  transmission  before  are 
transmitted  with  probability  either  1  or  p  depending  on  which 
version  of  the  ALOHA  algorithm  is  used.  In  our  discussion,  we 
will  assume  the  latter  choice. 

Under these conditions, and assuming that the number of newly
generated packets in each slot is a random variable (with mean $\lambda$)
independent from slot to slot, the number of packets awaiting
transmission (called backlog) is a Markov chain taking values in
{0, 1, 2, ...}. The central problem is to investigate under what
conditions the backlog Markov chain is ergodic, i.e., it is stable in
the sense that it reaches a steady state in which the times when
there are no packets to transmit are not too infrequent (the periods
between them have finite expected value). The transition
probabilities of the Markov chain are parametrized by the rate of
arrival of new packets $\lambda$ and the retransmission probability p.
Whereas $\lambda$ is fixed and given, p is chosen by the transmitters.
Hence, we are dealing with a fairly simple controlled Markov
chain whose control space is the interval (0, 1]. In the original
ALOHA algorithm, the control p remained constant and common
to all transmitters regardless of the information acquired by
listening to the channel, thereby resulting in open-loop control
of the Markov chain. Despite several “proofs” of the stability of
ALOHA published during the 1970's, neither the actual system
built in Hawaii nor the ideal Markov chain model was stable. The
reason why the open-loop system is unstable can be easily
understood by considering the backlog drift, d(n), which is
defined as the expected increase in the backlog over the next slot
when the current value of the backlog is equal to n. It is easy to see
that the backlog drift is given simply by the expected number of
new packets per slot minus the expected number of successfully
transmitted packets in the next slot, i.e.,

$$d(n) = \lambda - \left[ n p (1-p)^{n-1} \right]. \tag{3.1}$$

The drift quantifies the expected evolution of the Markov chain
from each state, and therefore it is a valuable tool in analyzing the
stability of the chain. For any p ∈ (0, 1] the term in brackets in
(3.1) goes to 0 as n → ∞, and hence, the drift is positive and close
to $\lambda$ for sufficiently large backlogs. This implies that when the
backlog is large it tends to grow, thereby eliminating any hope for
stability. Using standard results, this reasoning can be formalized
straightforwardly to prove not only the instability of the open-loop
system [11] for all values of $\lambda$ and p, but the fact that the backlog
goes to infinity with probability one [25], [35], [40].
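
The behavior of the open-loop drift can be illustrated numerically. The following is a minimal sketch that evaluates (3.1) for a fixed retransmission probability; the function names and the sample values of the arrival rate and of p are illustrative assumptions, not taken from the original system.

```python
def open_loop_drift(n, lam, p):
    """Backlog drift d(n) = lam - n*p*(1-p)**(n-1), as in (3.1)."""
    if n == 0:
        # With an empty backlog no packet can succeed, so the drift is lam.
        return lam
    return lam - n * p * (1.0 - p) ** (n - 1)


def drift_table(lam, p, backlogs):
    """Return (n, d(n)) pairs for a list of backlog values."""
    return [(n, open_loop_drift(n, lam, p)) for n in backlogs]


if __name__ == "__main__":
    # Illustrative values only: arrival rate 0.2 packets/slot, p = 0.05.
    for n, d in drift_table(lam=0.2, p=0.05, backlogs=[1, 5, 20, 100, 500]):
        print(f"n = {n:4d}   d(n) = {d:+.6f}")
```

For moderate backlogs the drift may well be negative, but for any fixed p it approaches the arrival rate as n grows, which is the mechanism behind the instability described above.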

Fortunately, the system can be stabilized by closed-loop
control. Let us examine first the case of complete-state information,
i.e., each station is informed at the end of each slot of the
current value of the backlog and chooses the retransmission
probability on the basis of that information. As far as stability is
concerned, the best choice of the retransmission probability p is
the value that minimizes the drift because that results in the
maximum possible arrival rate that guarantees stability (called the
throughput). It follows from (3.1) that the optimum value of p is

$$p^*(n) = \frac{1}{n}, \qquad n = 1, 2, \ldots \tag{3.2}$$

and the resulting drift is

$$d(n) = \lambda - \left(1 - \frac{1}{n}\right)^{n-1} \tag{3.3}$$

which is negative for n ≥ 1 when $\lambda < e^{-1}$, and is positive for
large backlogs when $\lambda > e^{-1}$. Therefore, the throughput of the
closed-loop system with complete state information is $e^{-1} \approx
0.368$. However, the relevance of complete state information
feedback is rather limited in practice. This is because the
instantaneous value of the backlog is available to each station only
if there exists so large a degree of communication among the
transmitters that much more efficient algorithms than ALOHA
can be used.
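
The choice p = 1/n and the resulting threshold on the arrival rate can be checked numerically. The sketch below is only an illustration: the helper names and the grid search are assumptions introduced here, and the grid search is a crude stand-in for the calculus argument behind (3.2).

```python
import math


def success_rate(n, p):
    """Expected number of successful packets per slot: n*p*(1-p)**(n-1)."""
    return n * p * (1.0 - p) ** (n - 1)


def closed_loop_drift(n, lam):
    """Drift obtained from (3.1) when the stations use p = 1/n."""
    return lam - success_rate(n, 1.0 / n)


def best_p_on_grid(n, grid_size=1000):
    """Grid search for the p in (0, 1] that maximizes the success rate."""
    grid = [i / grid_size for i in range(1, grid_size + 1)]
    return max(grid, key=lambda p: success_rate(n, p))


if __name__ == "__main__":
    print("grid maximizer for n = 10:", best_p_on_grid(10))
    print("1/e =", math.exp(-1))
    for n in (1, 10, 100, 1000):
        print(n, closed_loop_drift(n, lam=0.36), closed_loop_drift(n, lam=0.37))
```

With an arrival rate slightly below 1/e the drift stays negative at every backlog, whereas slightly above 1/e it becomes positive once the backlog is large enough.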

The  case  of  partial  state  information  is  the  problem  of  interest  in 
practice,  since  the  only  feedback  available  to  each  station  is  the 
outcome  (collision,  success,  empty)  of  the  transmission  in  each 
slot.  The  analysis  of  the  controlled  system  with  partial  state 
information was pioneered by Hajek and Van Loon [20] who
proposed a recursive updating law of the retransmission probabilities
as a function of the channel outcomes. This feedback policy
was shown in [21] to attain the throughput achievable with
complete-state information, namely $e^{-1}$. Those papers and subsequent
works have referred to the problem as decentralized
control of ALOHA, motivated by the fact that each station
chooses the retransmission probability autonomously based on the
channel feedback. However, it is useful to recognize that the
problem boils down to (centralized) stochastic control with one
decision maker and incomplete state information because all
stations are constrained to use the same retransmission probabilities.

We will review here the proof of stability of the following
certainty-equivalence closed-loop control:

$$p_k = \frac{1}{\hat{n}_k} \tag{3.4}$$

where $\hat{n}_k$ is an estimate of the backlog updated according to

$$\hat{n}_{k+1} = \begin{cases} \max\{1, \hat{n}_k + \alpha\}, & k\text{th slot is idle} \\ \hat{n}_k + \beta, & k\text{th slot is success or collision.} \end{cases} \tag{3.5}$$

The throughput attainable with this feedback law depends on the
constants $\alpha < 0$ and $\beta > 0$. As we will see, there exists a set of
choices for those constants that results in throughput equal to $e^{-1}$.

Unlike the case of complete-state information, the proof of
stability is not straightforward because now it is the two-dimensional
process formed by the backlog and its estimate
$\{(n_k, \hat{n}_k)\}_k$ (rather than the backlog itself) which is a Markov process.
According to (3.4) and (3.5) the drift of this Markov process is
given by

$$E[(n_{k+1}, \hat{n}_{k+1}) - (n_k, \hat{n}_k) \mid (n_k, \hat{n}_k) = (n, s)]
= \left( \lambda - \frac{n}{s}\left(1 - \frac{1}{s}\right)^{n-1},\; \beta + (\max\{\alpha, 1-s\} - \beta)\left(1 - \frac{1}{s}\right)^{n} \right)
= (d(n, s), c(n, s)). \tag{3.6}$$

Contrary to what we saw in the case when the state is known, it is
not true that the backlog drift is negative for sufficiently large
backlogs. As we can see in Fig. 5, if the estimate is far from the
true value, then the backlog may actually tend to increase.

However, at every point in the state space the tendency of the
process is to approach the diagonal where the estimate is equal to
the true value of the backlog. Furthermore, as Fig. 5 or the
analysis of the perfect-state information case shows, the drift
along the diagonal is negative. Such a behavior is a strong
indication of the stability of the controlled Markov process.

This can be proved using a powerful sufficient condition found
by Mikhailov [31] for the stability of a Markov process taking
values in $\mathbb{R}^+ \times \mathbb{R}^+$. In essence, Mikhailov’s condition states that
it is enough to restrict attention to those points of the state space
where either the backlog or its estimate is large and at which the
drift is radial, i.e.,

$$\frac{d(n, s)}{c(n, s)} = \frac{n}{s};$$

then, it is sufficient for stability that the drift point towards the
origin at those states. To see that this condition is indeed satisfied
for our system, we compute first the asymptotic drifts along the
radius $\{(n, s): n/s = \xi\}$ for $\xi \in [0, \infty)$

$$d(\xi) = \lim_{s \to \infty} d(\xi s, s) = \lambda - \xi e^{-\xi} \tag{3.7a}$$

$$c(\xi) = \lim_{s \to \infty} c(\xi s, s) = \beta + (\alpha - \beta) e^{-\xi}. \tag{3.7b}$$

It can be checked using (3.7) that if the constants $\alpha$ and $\beta$ in (3.5)
are chosen such that $\beta > 0.23\lambda$ and $\lambda - e^{-1} = \beta + (\alpha - \beta)e^{-1}$
then the drift is radial only at $\xi = 1$ (cf. Fig. 5), where it points
towards the origin as long as $d(1) = \lambda - e^{-1} < 0$.
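
The claim about radial directions can be explored numerically. The sketch below assumes the asymptotic drifts have the form d = λ − ξe^{−ξ} for the backlog and c = β + (α − β)e^{−ξ} for the estimate, where ξ is the ratio of backlog to estimate (called `ratio` in the code). It counts the ratios at which the drift vector (d, c) is parallel to (ratio, 1) by scanning a grid for sign changes. All function names, the grid parameters, and the sample constants are illustrative assumptions.

```python
import math


def backlog_drift(ratio, lam):
    """Asymptotic backlog drift along the ray n/s = ratio."""
    return lam - ratio * math.exp(-ratio)


def estimate_drift(ratio, alpha, beta):
    """Asymptotic drift of the backlog estimate along the ray n/s = ratio."""
    return beta + (alpha - beta) * math.exp(-ratio)


def alpha_for(lam, beta):
    """Solve lam - 1/e = beta + (alpha - beta)/e for alpha."""
    return beta + math.e * (lam - math.exp(-1) - beta)


def radial_gap(ratio, lam, alpha, beta):
    """Zero exactly when the drift (d, c) is parallel to (ratio, 1)."""
    return backlog_drift(ratio, lam) - ratio * estimate_drift(ratio, alpha, beta)


def count_radial_directions(lam, alpha, beta, max_ratio=20.0, step=0.01):
    """Count sign changes of radial_gap on a grid of midpoints in (0, max_ratio)."""
    changes = 0
    previous = radial_gap(0.5 * step, lam, alpha, beta)
    for i in range(1, int(max_ratio / step)):
        current = radial_gap((i + 0.5) * step, lam, alpha, beta)
        if previous * current < 0:
            changes += 1
        previous = current
    return changes


if __name__ == "__main__":
    lam = 0.3
    for beta in (0.1, 0.03):  # beta/lam above and below roughly 0.23
        alpha = alpha_for(lam, beta)
        print(beta, alpha, count_radial_directions(lam, alpha, beta))
```

With β well above 0.23λ the scan should find a single radial direction (at ratio 1), while a smaller β introduces additional radial directions at larger ratios, where the backlog drift is positive. The scan only covers ratios up to `max_ratio`, so it is a plausibility check rather than a proof.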
Mikhailov’s sufficient condition can be justified by constructing a
stochastic Lyapunov function to prove the stability of a Markov
process $\{x_k\}_k$ with state space $\mathbb{R}^+ \times \mathbb{R}^+$. To that end, it is
advantageous to switch to polar coordinates $(r, \phi)$ and to define
the radial drift $\delta(r, \phi)$ as the projection of the drift along the
direction of the point $(r, \phi)$ and the tangential drift $\mu(r, \phi)$ as the
projection of the drift along the direction perpendicular to $(r, \phi)$.
Denote the asymptotic drifts $\delta(\phi) = \lim_{r \to \infty} \delta(r, \phi)$ and
$\mu(\phi) = \lim_{r \to \infty} \mu(r, \phi)$ and define the function

$$V(r, \phi) = r\,\Phi(\phi), \qquad \Phi(\phi) = \exp\left(-C \int_0^{\phi} \mu(u)\, du\right), \quad \phi \in \left[0, \frac{\pi}{2}\right].$$

Note that $V(r, \phi)$ is a candidate Lyapunov function because it is
positive outside the origin and $V(r, \phi) \to \infty$ as $r \to \infty$.
Furthermore, it can be shown [31] that the asymptotic drift of the
candidate Lyapunov function is equal to

$$\lim_{r \to \infty} E[V(x_{k+1}) - V(x_k) \mid x_k = (r, \phi)] = \Phi(\phi)\left[\delta(\phi) - C\mu^2(\phi)\right]. \tag{3.8}$$

Now, under Mikhailov’s condition, the asymptotic drifts are
assumed continuous on $[0, \pi/2]$ and $\delta(\phi) < 0$ for any phase $\phi$ such
that $\mu(\phi) = 0$ (i.e., whenever the drift is radial it points towards
the origin); therefore, the constant C can be chosen large enough
so that the left side of (3.8) is upper bounded by a negative
constant. This implies that $V(r, \phi)$ is indeed a stochastic
Lyapunov function and therefore standard results on the stability
of stochastic systems [27], [45] can be applied to show the
stability of the system.⁴
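
The form of the asymptotic drift of V can be motivated by a heuristic first-order expansion. This is a sketch only, not the proof in [31]: it assumes that the increments of the process are small relative to r, and that the exponent of Φ carries a minus sign, so that Φ'(φ) = −Cμ(φ)Φ(φ). A radial displacement changes r directly, while a tangential displacement of size μ changes the angle by approximately μ/r.

```latex
\begin{align*}
E[V(x_{k+1}) - V(x_k) \mid x_k = (r,\phi)]
  &\approx \frac{\partial V}{\partial r}\,\delta(r,\phi)
     + \frac{1}{r}\,\frac{\partial V}{\partial \phi}\,\mu(r,\phi) \\
  &= \Phi(\phi)\,\delta(r,\phi) + \Phi'(\phi)\,\mu(r,\phi), \\
\Phi'(\phi) = -C\,\mu(\phi)\,\Phi(\phi)
  \quad &\Longrightarrow \quad
  \lim_{r\to\infty} E[V(x_{k+1}) - V(x_k) \mid x_k = (r,\phi)]
  = \Phi(\phi)\left[\delta(\phi) - C\,\mu^2(\phi)\right].
\end{align*}
```

Under this reading, the term −Cμ² dominates wherever the tangential drift is nonzero, and the radial drift alone decides the sign where μ vanishes.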

In  some  multiaccess  environments,  the  receiver  can  indeed 
demodulate  reliably  one  or  more  packets  even  in  the  presence  of 
other  interfering  packets  and  the  collision  channel  model  no 
longer  applies  to  those  cases.  The  results  reviewed  in  this  section 
can be extended to a general channel with multipacket
reception capability, to show that: 1) the throughput of open-loop
ALOHA is equal to the limit of the expected number of
successfully received packets per slot as the backlog goes to
infinity [17]; and 2) the throughput of closed-loop ALOHA (with
either complete or partial state information) is equal to the
maximum over v of the expected number of successfully received
packets per slot when the number of attempted transmissions is a
Poisson random variable with mean v [18].
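
The second characterization can be evaluated numerically for a given reception model. In the sketch below, a channel is described by a function giving the expected number of packets received when n packets are transmitted in a slot; the collision channel is the special case in which that number is 1 for n = 1 and 0 otherwise. The function names, the grid search, the truncation of the Poisson sum, and the toy capture channel are all illustrative assumptions and do not come from [17] or [18].

```python
import math


def poisson_pmf(n, v):
    """P[N = n] for a Poisson random variable with mean v."""
    return math.exp(-v) * v ** n / math.factorial(n)


def expected_successes(v, mean_received, max_attempts=40):
    """Expected packets received per slot with Poisson(v) attempts (truncated sum)."""
    return sum(poisson_pmf(n, v) * mean_received(n) for n in range(max_attempts + 1))


def collision_channel(n):
    """Classical collision channel: a packet gets through only if it is alone."""
    return 1.0 if n == 1 else 0.0


def toy_capture_channel(n):
    """Hypothetical channel: one of two colliding packets survives half the time."""
    if n == 1:
        return 1.0
    if n == 2:
        return 0.5
    return 0.0


def closed_loop_throughput(mean_received, v_max=5.0, step=0.001):
    """Grid search for the maximum over v of the expected number of successes."""
    grid = [i * step for i in range(1, int(round(v_max / step)) + 1)]
    best_v = max(grid, key=lambda v: expected_successes(v, mean_received))
    return best_v, expected_successes(best_v, mean_received)


if __name__ == "__main__":
    print("collision channel:", closed_loop_throughput(collision_channel))
    print("toy capture channel:", closed_loop_throughput(toy_capture_channel))
```

For the collision channel the maximization reduces to maximizing v·e^{−v}, which recovers the value 1/e at v = 1.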

Returning  to  the  case  of  the  collision  channel,  the  next  natural 
step  is  to  drop  the  main  restriction  in  the  ALOHA  algorithm, 
namely,  that  all  stations  use  the  same  retransmission  probability. 
This  is  done  in  a  class  of  random-access  algorithms  referred  to  as 
collision  resolution  algorithms  which  are  characterized  by  the  fact 
that  not  only  are  all  blocked  packets  eventually  retransmitted 
successfully,  but  all  users  eventually  become  aware  that  these 
packets  have  been  successfully  retransmitted.  Contrary  to  the 
ALOHA  algorithm,  the  decision  whether  or  not  to  transmit  a 
packet  takes  into  account  the  previous  history  of  attempted 
retransmissions  of  that  particular  packet.  The  introduction  of  this 
new dimension into the problem renders Markov chain tools
considerably less useful than in the foregoing analysis and
converts it into a very difficult decentralized stochastic control
problem, for which the optimum throughput remains unknown⁵
despite many efforts.


⁴ Another choice of stochastic Lyapunov function for the specific case of
decentralized control of ALOHA can be found in [43].

⁵ The best known algorithm has been shown to achieve a throughput of
0.488 using Howard’s policy iteration for sequential infinite-horizon problems
[32] or by reduction to a simple optimization problem [48]. On the other hand,
it is known that the optimum throughput is upper bounded by 0.568 [44].


B.  Simultaneous  Transmission 

In contrast to random-access communication systems, in
simultaneous transmission multiple-access systems, the transmitters
send their messages simultaneously, independently, and
without monitoring the channel in any way. The most common
type of simultaneous transmission system is code-division multiplexing,
where each user modulates a preassigned signature
waveform known by the receiver.

Specifically, we will assume that in order to send the message
$\{b_k(i) \in A\}_{i=0}^{M-1}$ (i.e., a string of M symbols drawn from a finite
set A), the kth user transmits

$$\sum_{i=0}^{M-1} b_k(i)\, s_k(t - iT)$$

where $\{s_k(t),\ 0 \le t \le T\}$ is the waveform assigned to the kth
user, and T is the symbol period. Then the demodulator receives
the sum of the signals transmitted by the K active users embedded
in noise

$$r(t) = \sum_{k=1}^{K} \sum_{i=0}^{M-1} b_k(i)\, s_k(t - iT - \tau_k) + n(t) \tag{3.8}$$

where the offsets $\tau_1 \le \cdots \le \tau_K \in [0, T)$ model the fact that the users
do not synchronize their transmissions. Then the task of the
receiver is to recover the transmitted information strings
$\{b_k(i)\}_{i=0}^{M-1}$, $k = 1, \ldots, K$. Following [47] we will show how to obtain an
optimum multiuser demodulator via dynamic programming. First,
denote the MK-vector

$$d = \{ d_{k+iK} = b_k(i),\ k = 1, \ldots, K,\ i = 0, \ldots, M-1 \}$$

and the multiuser signal in (3.8)

$$S(t, d) = \sum_{k=1}^{K} \sum_{i=0}^{M-1} b_k(i)\, s_k(t - iT - \tau_k) = \sum_{j=1}^{MK} d_j z_j(t) \tag{3.9}$$

where $z_{k+iK}(t) = s_k(t - iT - \tau_k)$.

A reasonable criterion for demodulating the information carried
in $S(t, d)$ upon observation of $r(t)$ is to select the MK-vector d
that best explains the received waveform in the sense of
minimizing the energy of the corresponding noise realization,
i.e.,

$$\min_{d} \| S(t, d) - r(t) \|. \tag{3.10}$$

If the noise n(t) is white and Gaussian, then this criterion results
in maximum likelihood decisions. Equivalently, the objective is to
find the vector that solves

$$\max_{d} \Omega(d) \tag{3.11}$$

where

$$\Omega(d) = 2 \int_{-\infty}^{\infty} S(t, d)\, r(t)\, dt - \int_{-\infty}^{\infty} S^2(t, d)\, dt. \tag{3.12}$$
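
The equivalence between minimizing the noise energy and maximizing Ω(d) follows from expanding the square. The short derivation below is a sketch that assumes the norm in (3.10) is the usual L2 norm over the whole time axis and that r(t) has finite energy.

```latex
\begin{align*}
\| S(t,d) - r(t) \|^2
  &= \int_{-\infty}^{\infty} \left[ S(t,d) - r(t) \right]^2 dt \\
  &= \int_{-\infty}^{\infty} r^2(t)\, dt
     - \left[ 2 \int_{-\infty}^{\infty} S(t,d)\, r(t)\, dt
              - \int_{-\infty}^{\infty} S^2(t,d)\, dt \right] \\
  &= \| r(t) \|^2 - \Omega(d).
\end{align*}
```

Since the energy of r(t) does not depend on d, the vector that minimizes the left-hand side is the one that maximizes Ω(d).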

Since the maximization in (3.11) is over a finite set, we could solve
it by the brute-force method of evaluating $\Omega(d)$ for each possible
argument. However, it is possible to decompose $\Omega(d)$ in a
sequential fashion that lends itself to efficient optimization. From
(3.9) it is immediate to write the first integral in (3.12)
sequentially

$$\int_{-\infty}^{\infty} S(t, d)\, r(t)\, dt = \sum_{j=1}^{MK} d_j y_j \tag{3.13}$$

where

$$y_j = \int_{-\infty}^{\infty} z_j(t)\, r(t)\, dt. \tag{3.14}$$

Fig. 6. Symbol epochs for K = 3 and M = 5.

This implies that the objective function (3.12) depends on r(t)
only through the quantities $y_j$, $j = 1, \ldots, MK$, which are obtained by
correlating r(t) with each of the signature waveforms during each
symbol epoch. In order to find an explicit expression for the
second integral on the right-hand side of (3.12), which is the
energy of the multiuser signal, we will denote

$$R(j, l) = \int_{-\infty}^{\infty} z_j(t)\, z_l(t)\, dt. \tag{3.15}$$

It follows immediately from the definition that these coefficients
satisfy the following properties.

1) $R(k + iK, k + iK) = \int_0^T s_k^2(t)\, dt \triangleq w_k$.

2) $R(k + iK, n + iK) = R(k, n)$ for all i.

3) $R(j, l) = 0$ unless $|j - l| < K$.

The first property indicates that each of the diagonal elements
of R(i, j) is equal to the energy of one of the K assigned
waveforms. The second and third properties can be illustrated by
referring to Fig. 6 which represents the symbol epochs of three
asynchronous users sending strings of M = 5 symbols. Each
symbol period in Fig. 6 is labeled with the index of the
corresponding component of the vector d. The second property
indicates that the cross-correlations between two signals depend
only on their relative location (e.g., R(4, 6) = R(13, 15) in Fig.
6) and the third property states that each symbol only interferes
with 2K - 2 symbols of the other users [e.g., in Fig. 6, $d_9 = b_3(2)$
only overlaps with $d_7 = b_1(2)$, $d_8 = b_2(2)$, $d_{10} = b_1(3)$, and
$d_{11} = b_2(3)$]. It follows from these properties that the coefficients
in (3.15) can be obtained from the K × K matrix {R(k, n)}
whose diagonal elements correspond to the energy per
symbol of each user and whose off-diagonal elements correspond
to the cross-correlations between the signature waveforms of each
pair of users. Using (3.15), the foregoing properties, and letting
$k(j) \in \{1, \ldots, K\}$ be the modulo-K remainder of j (i.e., for
some i, $j = k(j) + iK$), we can write

$$\int_{-\infty}^{\infty} S^2(t, d)\, dt = \sum_{j=1}^{MK} \sum_{l=1}^{MK} d_j d_l R(j, l)
= \sum_{j=1}^{MK} d_j \left[ d_j w_{k(j)} + 2 \sum_{n=1}^{K-1} d_{j-n}\, g_{k(j)}(K - n) \right] \tag{3.16}$$

where $g_k(m) = R(k + K, k + m)$. Putting together (3.12),
(3.13), and (3.16) we see that we can express $\Omega(d)$ as a sum of
MK terms, each of which depends on K components of d and such
that consecutive terms depend on the same components but one.
Specifically, we can write

$$\Omega(d) = \sum_{j=1}^{MK} \lambda_j(x_j, d_j) \tag{3.17}$$

where

$$\lambda_j(x, u) = u \left[ 2 y_j - u\, w_{k(j)} - 2 x^T g_{k(j)} \right] \tag{3.18}$$

and $x_j$ is the state of a shift-register K - 1 dimensional system

$$x_{j+1} = [x_{j+1}(1), \ldots, x_{j+1}(K-1)] = [x_j(2), \ldots, x_j(K-1), d_j]; \qquad x_0 = 0. \tag{3.19}$$


It is now apparent that the solution to (3.11) entails solving a
finite-horizon deterministic optimal control problem with
additive costs per stage for the linear system in (3.19), and with a
finite admissible control set A. Therefore, optimum multiuser
demodulation is equivalent to a shortest path problem in an
MK-stage layered directed graph, where at each stage there are
$|A|^{K-1}$ states. This optimization problem can be solved by dynamic
programming (e.g., [7]) in backward or forward fashion. In
practice, it is necessary to demodulate the transmitted symbols in
real-time, and since M is usually a very large integer, it is not
feasible to wait until all the observables $y_j$ have been
obtained before starting to make decisions. Therefore, a suboptimum
version of the forward dynamic programming algorithm is
adopted in practice whereby each decision is based on the paths
corresponding to the cost-to-arrive function computed a fixed
number of steps ahead. This real-time version of forward dynamic
programming is known in communication theory as the Viterbi
algorithm [12], and was originally devised (without resorting to
the dynamic programming framework) for decoding convolutional
codes. The maximum-likelihood criterion used in (3.10) is
not the only possible optimality criterion. For example, if the
objective is to minimize the probability of error for each user,
then the multiuser demodulator uses a backward-forward
dynamic programming algorithm [49] whereby optimum decisions
are based on the independent computation of a cost-to-go
and a cost-to-arrive function.
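
The forward dynamic programming recursion can be sketched in a few lines. The code below assumes a stage reward of the form u[2y_j − u·w_k − 2xᵀg_k], with x the shift-register state holding the previous K − 1 decisions, and it maximizes the sum of stage rewards over the full horizon (it is not the truncated real-time variant). Indices are 0-based, so user k of the text corresponds to `j % K` here. The function names, the antipodal alphabet, and all numerical values are illustrative assumptions; a brute-force search is included only as a cross-check.

```python
from itertools import product


def shift(x, u):
    """Shift-register update: drop the oldest symbol, append the newest."""
    return (x + (u,))[1:]


def stage_reward(x, u, y_j, w_k, g_k):
    """Reward u * (2*y_j - u*w_k - 2 * <x, g_k>) collected at one stage."""
    return u * (2.0 * y_j - u * w_k - 2.0 * sum(xi * gi for xi, gi in zip(x, g_k)))


def omega(d, y, w, g, K):
    """Total objective for a decision sequence d, starting from the zero state."""
    x, total = (0,) * (K - 1), 0.0
    for j, u in enumerate(d):
        total += stage_reward(x, u, y[j], w[j % K], g[j % K])
        x = shift(x, u)
    return total


def forward_dp(y, w, g, K, alphabet):
    """Forward dynamic programming over the shift-register states."""
    best = {(0,) * (K - 1): (0.0, ())}  # state -> (best value so far, decisions)
    for j in range(len(y)):
        nxt = {}
        for x, (value, path) in best.items():
            for u in alphabet:
                cand = value + stage_reward(x, u, y[j], w[j % K], g[j % K])
                x_next = shift(x, u)
                if x_next not in nxt or cand > nxt[x_next][0]:
                    nxt[x_next] = (cand, path + (u,))
        best = nxt
    return max(best.values(), key=lambda item: item[0])


def brute_force(y, w, g, K, alphabet):
    """Exhaustive search over all sequences, for cross-checking small cases."""
    return max(((omega(d, y, w, g, K), d) for d in product(alphabet, repeat=len(y))),
               key=lambda item: item[0])


if __name__ == "__main__":
    K, alphabet = 3, (-1, 1)                      # three users, antipodal symbols
    y = [0.9, -1.1, 0.4, -0.3, 1.2, 0.7]          # made-up correlator outputs (M = 2)
    w = [1.0, 1.0, 1.0]                           # made-up symbol energies
    g = [[0.2, 0.1], [0.3, -0.2], [-0.1, 0.25]]   # made-up cross-correlations
    print(forward_dp(y, w, g, K, alphabet))
    print(brute_force(y, w, g, K, alphabet))
```

The recursion keeps one survivor path per state, so its work grows linearly with the number of stages, whereas the exhaustive search grows exponentially with it.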


IV.  Other  Problem  Areas 

Routing and multiple access are not the only problem areas in
the field of communication networks which control theory can
help formulate, study, and solve. We have deliberately chosen to
confine our attention to these two areas in order to get across in a
concise manner our belief that the field of communication
networks offers a rich selection of applications for control theory.
We would feel remiss, however, if we did not even make an
attempt to provide a taste of some of the numerous other design
and operation issues that, again, bring forth control systems
concepts and techniques. For this purpose, and with a conscious
effort not to expand in depth but only to describe, we will mention
two areas from point-to-point networks and one from radio
networks. The first two concern flow control and integrated
switching, respectively, while the third concerns the problem of
scheduling transmission in multihop networks. Unlike the cases of
routing and multiple access, these areas have not yet fully
benefitted from the use of control theoretic approaches although
such approaches would be very well suited to them indeed.

A. Flow Control

A stark reality in the design of networks is that despite the
reduction of the cost of memory, storage at each node is going to
be finite. Coupled with another reality, namely that data transmissions
on the whole continue to be bursty, it implies that buffer
overflow may occur and, along with it, congestion and deadlocks.
Flow control is the name we use to describe the collection of
measures taken to avoid buffer overflow and highly congested
nodes in the network. Congestion and saturation are often the
consequences of diverging, unstable behavior. Thus, it is of
interest not only to optimize over possible flow control strategies,
but to determine their robustness against disturbances or modeling
inaccuracies that may lead to unstable behavior.

The control variables in flow control problems are admission
(or blocking) probabilities for messages or sessions at the source
node. In practice these are often implemented in terms of a bang-bang
control strategy known as window flow control whereby
input ports are allowed to continuously inject messages into the
network at the full desired input rate until the number of
unacknowledged⁶ messages exceeds the value of the “window
size” w. A simple, yet unanswered question is, what should the
value of w be?
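
The bang-bang character of window flow control can be made concrete with a small sketch. The class below is a hypothetical illustration, not a description of any particular protocol: its name and methods are assumptions, and it adopts the common convention that injection stops as soon as the number of unacknowledged messages reaches the window size w.

```python
class WindowFlowControl:
    """Admit messages freely until w of them are unacknowledged, then block."""

    def __init__(self, window_size):
        if window_size < 1:
            raise ValueError("window size must be at least 1")
        self.window_size = window_size
        self.unacknowledged = 0

    def can_send(self):
        """True while the window still has room for another message."""
        return self.unacknowledged < self.window_size

    def send(self):
        """Try to inject one message; return False if the window is full."""
        if not self.can_send():
            return False
        self.unacknowledged += 1
        return True

    def acknowledge(self):
        """Record one acknowledgment arriving from the destination."""
        if self.unacknowledged > 0:
            self.unacknowledged -= 1


if __name__ == "__main__":
    port = WindowFlowControl(window_size=2)
    print([port.send() for _ in range(3)])  # third attempt is blocked
    port.acknowledge()
    print(port.send())                      # room again after an acknowledgment
```

The admission decision is all-or-nothing, which is what makes the strategy bang-bang; the sketch says nothing about how w should be chosen.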

Previous  efforts  to  use  control  theory  tools  to  analyze  optimal 
flow  control  problems  include  [28]  and  [46]  where  the  optimality 
of  window  flow  control  is  proved  within  the  domain  of  a 
simplified  model,  and  [39]  where  dynamic  programming  value 
iteration techniques are used to characterize optimal flow control
performance. An alternative approach to the flow control problem
is to subsume it into the static routing problem considered in
Section  II-A  [19]:  suppose  that  for  every  source-destination  pair  a 
fictitious  direct  link  is  added  between  them.  We  can  then  interpret 
the  blocking  action  of  a  flow  control  procedure  as  a  diversion  of 
the  blocked  portion  of  the  traffic  through  this  fictitious  link  to  the 
destination.  Thus,  we  can  consider  that  no  traffic  is  blocked.  Of 
course,  in  order  to  discourage  the  use  of  this  fictitious  link  we 
must  augment  the  overall  delay  cost  function  with  a  term  that 
penalizes  appropriately  the  use  of  this  link. 

B.  Integrated  Switching 

A revolutionary development in the field of networks whose
implementation is currently under way is the combination of the
capabilities of what have been separately developed in the past and
called voice networks and data networks. Voice is a commodity
that must meet different requirements than data. For example,
speech signals have inherent redundancy that makes them quite
robust with respect to occasional errors or deliberate compression.
At the same time, except in applications of voice messaging,
speech signals occur in the context of real-time conversations and,
as such, must encounter short and, more importantly, constant
delay. On the other hand, data must preserve their integrity and
cannot tolerate errors; however, long and variable delays can
often be tolerated.

How does one design a single network that can handle such
dissimilar commodities with automated procedures? The natural
course of events in the last decade or two was to attempt to force
data on primarily voice networks or to let voice ride on what were
mainly data networks. The literature is full of ideas for baseline
integration that are mostly heuristic and difficult to analyze. An
attempt to formulate the problem of integrated switching as an
optimization problem was presented in [50]. In its simplest form
the model is as follows: consider a single node in the network with
a single outgoing link on which incoming voice calls and data
packets must be multiplexed. Let W be the bandwidth of the
outgoing link. Let V be the bandwidth required for the continuous,
uninterrupted accommodation of a single voice call. Let,
therefore, N = W/V be the maximum number of calls that can be
assigned dedicated circuits simultaneously if no data packets are
transmitted. A voice call can either be accepted (and assigned the
necessary bandwidth V) or blocked. Data packets can be stored in
a buffer facility. If, at a given time, there are i calls in the system,
the data packets can be served at the full rate corresponding to the
remaining bandwidth W - iV. Such a switching architecture
represents what has been called the movable boundary idea in
integration. A natural MDP can be simply formulated as follows:
choose the control action of blocking or accepting a call upon
arrival in order to minimize the weighted sum of the average data
packet delay and the call-blocking probability. If we assume that
both arrival streams (voice calls and data) are independent Poisson
processes, that the call holding time is exponentially distributed,
and that the message lengths are likewise exponential, we can
apply the technique described in Section II of converting the MDP
to an LP and show that the optimal policy has the useful
switching-type form. Namely, if i is the number of ongoing calls
and j the total number of data messages at the node, the optimal
control action should be to block the call in region B of the state
space as shown in Fig. 7 and to accept it in region A.

⁶ Note the implicit assumption of delayed feedback information from the
destination to the source node.

Fig. 7. Switching-type optimum policy for integrated switching.

C.  Link  Scheduling 

Let  us  now  turn  our  attention  back  to  the  radio  network 
environment.  In  Section  III  the  multiple  access  channel  was 
considered  and  a  number  of  difficult  but  interesting  control 
problems  were  identified.  Throughout  that  discussion,  it  was 
assumed  that  all  terminals  are  within  a  single  transmission  hop 
from  the  destination.  In  many  radio  networks,  however,  this  is  not 
the  case.  Messages  need  to  be  relayed  via  intermediate  nodes  to 
their  final  destinations.  Thus,  the  familiar  problem  of  routing 
arises  again,  except  that  this  time  there  is  a  new  twist  to  it.  In 
point-to-point  networks,  transmissions  between  different  node 
pairs  can  take  place  simultaneously  because  there  are  dedicated, 
“hard-wired”  links  between  the  corresponding  nodes.  In  a  radio 
(or,  more  generally,  in  a  multiaccess/broadcast)  environment,  if 
the  nodes  are  densely  connected,  not  all  transmissions  can  take 
place simultaneously (unless separate dedicated channels or
simultaneous transmission signaling techniques (Section III-B) are
used). They must be scheduled in time to avoid the interference
that  would  occur  otherwise. 

It  becomes  evident  that  the  mere  fact  that  the  transmission 
among  a  group  of  nodes  must  take  place  one  at  a  time  raises  the 
question  whether  the  intended  transmissions  are  routing-wise 
optimal  any  more.  Several  versions  of  this  problem  have  been 
studied  [3],  [23],  [36].  In  every  case  and  even  if  the  routing 
problem  is  sidestepped,  we  are  led  to  hard  combinatorial 
optimization problems where questions of computational complexity
and distributed implementation are of primary importance.

V.  Conclusion 

It should be clear by now that the theory of linear and nonlinear
optimization, dynamic programming, stochastic control, stability
analysis, and distributed control has found interesting applications
in the analysis and design of communication networks. Unlike other
complex systems that have been successfully studied by control
system theorists in the past (such as chemical plants, flexible
aircraft, robot systems, etc.), communication networks stand out in
that the commodity to be controlled is information (including its
transmission, storage, processing, etc.). This feature, perhaps,
misleads and intimidates those who do not feel sufficiently
interdisciplinary to tackle these problems. We hope that by
presenting a few examples in which concrete, purely control-theoretic
problems can be formulated and have been (or can be) studied
successfully, we may encourage attention by the control community to
this application area that is especially rich in new challenges.

As  stated  from  the  outset,  we  did  not  attempt  to  survey  or 
completely  cover  the  multiple  control  facets  of  communication 
networks.  The  collection  in  this  paper  merely  represents  an  effort 
to  illuminate  a  few  selected  problem  areas  and  to  show  how 
control  techniques  apply  to  them. 


References 

[1] N. Abramson, "Development of the ALOHANET," IEEE Trans. Inform. Theory, vol. IT-31, pp. 119-123, Mar. 1985.

[2] P. O. Bhattacharya et al., "A (not very) simple dynamic routing problem," in Proc. 25th Allerton Conf. Contr. Commun. Comput., Urbana, IL, 1987, pp. 998-1006.

[3] D. J. Baker, J. Wieselthier, and A. Ephremides, "A distributed algorithm for scheduling the activation of links in a self-organizing mobile radio network," in Proc. IEEE Int. Conf. Commun., Philadelphia, PA, June 1982, pp. 2F.6.1-5.

[4] M. Bazaraa and C. Shetty, Nonlinear Programming: Theory and Algorithms. New York: Wiley, 1979.

[5] D. Bertsekas and R. Gallager, Data Networks. Englewood Cliffs, NJ: Prentice-Hall, 1987.

[6] D. P. Bertsekas, "Distributed dynamic programming," IEEE Trans. Automat. Contr., vol. AC-27, pp. 610-616, June 1982.

[7] D. P. Bertsekas, Dynamic Programming: Deterministic and Stochastic Models. Englewood Cliffs, NJ: Prentice-Hall, 1987.

[8] C. Buyukkoc, P. Varaiya, and J. Walrand, "The cμ rule revisited," Advances Appl. Probability, vol. 17, pp. 237-238, 1985.

[9] C. A. Courcoubetis, "Optimal control of a queueing system with simultaneous service requirements," IEEE Trans. Automat. Contr., vol. AC-32, pp. 717-727, 1987.

[10] A. Ephremides, P. Varaiya, and J. Walrand, "A simple dynamic routing problem," IEEE Trans. Automat. Contr., vol. AC-25, pp. 690-693, Aug. 1980.

[11] G. Fayolle, E. Gelenbe, and L. Labetoulle, "Stability and optimal control of the packet switching broadcast channel," JACM, vol. 24, pp. 375-386, 1977.

[12] G. D. Forney, "The Viterbi algorithm," Proc. IEEE, vol. 61, pp. 268-278, Mar. 1973.

[13] M. Frank and P. Wolfe, "An algorithm for quadratic programming," Naval Res. Logist. Quart., vol. 3, pp. 149-154, 1956.

[14] L. Fratta, M. Gerla, and L. Kleinrock, "The flow deviation method: An approach to store-and-forward communication network design," Networks, vol. 3, pp. 97-133, 1973.

[15] T. Gal, Postoptimal Analyses, Parametric Programming, and Related Topics. Englewood Cliffs, NJ: McGraw-Hill, 1979.

[16] R. Gallager, "A minimum delay routing algorithm using distributed computation," IEEE Trans. Commun., vol. COM-23, pp. 73-85, 1977.

[17] S. Ghez, S. Verdú, and S. Schwartz, "Stability properties of slotted Aloha with multipacket reception capability," IEEE Trans. Automat. Contr., vol. 33, pp. 640-649, July 1988.

[18] S. Ghez, S. Verdú, and S. Schwartz, "Optimal decentralized control in the multipacket channel," IEEE Trans. Automat. Contr., to be published.

[19] S. J. Golestaani, "A unified theory of flow control and routing on data communication networks," Ph.D. dissertation, Mass. Inst. Technol., Cambridge, 1980.

[20] B. Hajek and T. van Loon, "Decentralized dynamic control of a multiaccess broadcast channel," IEEE Trans. Automat. Contr., vol. AC-27, pp. 559-569, June 1982.

[21] B. Hajek, "Hitting-time and occupation time bounds implied by drift analysis with applications," Adv. Appl. Prob., vol. 14, pp. 502-525, 1982.

[22] B. Hajek, "Optimal control of two interacting service stations," IEEE Trans. Automat. Contr., vol. AC-29, pp. 491-499, June 1984.

[23] B. Hajek and G. Sasaki, "Link scheduling in polynomial time," IEEE Trans. Inform. Theory, vol. 34, pp. 910-917, Sept. 1988.




[24] J. M. Harrison, "Dynamic scheduling of a multiclass queue: Discount optimality," Oper. Res., vol. 23, pp. 270-282, 1975.

[25] F. P. Kelly, "Stochastic models of computer communications systems," J. R. Statist. Soc. B, vol. 47, pp. 379-395, 1985.

[26] G. P. Klimov, "Time sharing service systems," Theory Probability Appl., vol. 19, pp. 532-551, Sept. 1974; see also vol. 23, pp. 314-321, June 1978.

[27] H. Kushner, Introduction to Stochastic Control. New York: Holt, Rinehart, and Winston, 1971.

[28] A. Lazar, "Optimum flow control of a class of queueing networks in equilibrium," IEEE Trans. Automat. Contr., vol. AC-28, pp. 1001-1007, Nov. 1983.

[29] W. Lin and P. R. Kumar, "Optimal control of a queueing system with two heterogeneous servers," IEEE Trans. Automat. Contr., vol. AC-29, pp. 696-703, Aug. 1984.

[30] S. A. Lippman, "Semi-Markov decision processes with unbounded rewards," Management Sci., vol. 19, pp. 717-731, 1973.

[31] V. A. Mikhailov, "Geometrical analysis of the stability of Markov chains in R^n, and its application to throughput evaluation of the adaptive random multiple access algorithm," Problems of Inform. Transmission, vol. 24, pp. 47-56, Jan.-Mar. 1988.

[32] J. Mosely and P. Humblet, "A class of efficient contention resolution algorithms for multiple access channels," IEEE Trans. Commun., vol. COM-33, pp. 145-151, Feb. 1985.

[33] P. Nain and K. W. Ross, "Optimal priority assignment with hard constraint," IEEE Trans. Automat. Contr., vol. AC-31, pp. 883-888, 1986.

[34] C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity. Englewood Cliffs, NJ: Prentice-Hall, 1982.

[35] S. Parekh, F. Schoute, and J. Walrand, "Instability and geometric transience of the Aloha protocol," in Proc. 26th Conf. Decision Contr., Los Angeles, CA, Dec. 1987, pp. 1073-1077.

[36] M. J. Post, P. E. Sarachik, and A. S. Kershenbaum, "A biased greedy algorithm for scheduling multi-hop radio networks," in Proc. Conf. Inform. Sci. Syst., Johns Hopkins Univ., Baltimore, MD, 1985, pp. 564-672.

[37] R. T. Rockafellar, Convex Analysis. Princeton, NJ: Princeton University Press, 1970.

[38] Z. Rosberg, P. Varaiya, and J. Walrand, "Optimal control of service in tandem queues," IEEE Trans. Automat. Contr., vol. AC-27, pp. 600-610, June 1982.

[39] Z. Rosberg and I. Gopal, "Optimal hop-by-hop flow control in computer networks," IEEE Trans. Automat. Contr., vol. AC-31, pp. 813-822, Sept. 1986.

[40] W. Rosberg and D. Towsley, "On the instability of slotted-ALOHA multiaccess algorithm," IEEE Trans. Automat. Contr., vol. AC-28, pp. 994-996, Oct. 1983.

[41] R. E. Tarjan, Data Structures and Network Algorithms (CBMS-NSF Reg. Conf. Series in Appl. Math. no. 7). Philadelphia, PA: SIAM, 1983.

[42] J. N. Tsitsiklis and D. P. Bertsekas, "Distributed asynchronous optimal routing in data networks," IEEE Trans. Automat. Contr., vol. AC-31, pp. 325-332, Apr. 1986.

[43] J. N. Tsitsiklis, "Analysis of a multiaccess control scheme," IEEE Trans. Automat. Contr., vol. AC-32, pp. 1017-1020, Nov. 1987.

[44] B. S. Tsybakov and N. B. Likhanov, "An improved upper bound on capacity of the random multiple-access channel," Problemi Peredachi Informatsii, vol. 23, pp. 64-78, 1987.

[45] R. Tweedie, "Sufficient conditions for ergodicity and recurrence of Markov chains on a general state space," Stoch. Proc. Appl., vol. 3, pp. 385-403, 1975.

[46] F. Vakil and A. Lazar, "Flow control protocols for integrated networks with partially observed voice traffic," IEEE Trans. Automat. Contr., vol. AC-32, pp. 2-14, Jan. 1987.

[47] S. Verdú, "Minimum probability of error for asynchronous Gaussian multiple-access channels," IEEE Trans. Inform. Theory, vol. IT-32, pp. 85-96, Jan. 1986.

[48] S. Verdú, "Computation of the efficiency of the Mosely-Humblet contention resolution algorithm: A simple method," Proc. IEEE, vol. 74, pp. 613-614, Apr. 1986.

[49] S. Verdú and H. V. Poor, "Abstract dynamic programming models under commutativity conditions," SIAM J. Contr. Optimiz., vol. 4, July 1987.

[50] I. Viniotis and A. Ephremides, "Optimal switching of voice and data at a network node," Proc. 26th CDC, Los Angeles, CA, Dec. 1987, pp. 1504-1507.

[51] J. Viniotis, Ph.D. dissertation, Univ. Maryland, College Park, MD, 1988.

[52] J. Walrand, "A note on optimal control of a queueing system with two heterogeneous servers," Syst. Contr. Lett., vol. 4, pp. 131-134, 1984.

[53] J. Walrand, An Introduction to Queueing Networks. Englewood Cliffs, NJ: Prentice-Hall, 1988.

[54] R. Weber, "On the optimal assignment of customers to parallel servers," J. Appl. Prob., vol. 15, pp. 406-413, 1978.

[55] Z. Wu, P. B. Luh, S. Chang, and D. A. Castanon, "Optimal control of a queueing system with two interacting service stations and three classes of impatient tasks," IEEE Trans. Automat. Contr., vol. 33, pp. 42-49, 1988.


IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 34, NO. 9, SEPTEMBER 1989


Anthony Ephremides (S'68-M'71-SM'77-F'84) was born in Athens, Greece, in 1943. He received the Ph.D. degree in electrical engineering from Princeton University, Princeton, NJ, in 1971.

He has been with the University of Maryland, College Park, since 1971. He has also spent semesters on leave at M.I.T., the University of California, Berkeley, and ETH, Zurich. He is active in professional consulting as President of Pontos, Inc. Currently, his research interests lie in the areas of communication systems, performance analysis, modeling, optimization, simulation, and design.

Dr. Ephremides is Director of Division X of the IEEE and has served as President of the Information Theory Society and on the Board of Governors of the Control Systems Society. He has been Associate Editor of the IEEE TRANSACTIONS ON AUTOMATIC CONTROL and General Chairman of major IEEE Conferences.


Sergio Verdú (S'80-M'84-SM'88) was born in Barcelona, Catalonia, Spain, in 1958. He received the Telecommunication Eng. degree from the Polytechnic University of Barcelona in 1980 and the Ph.D. degree in electrical engineering from the University of Illinois at Urbana-Champaign in 1984.

Upon completion of his doctorate he joined the faculty of Princeton University, Princeton, NJ, where he is currently an Associate Professor of Electrical Engineering. His current research interests are in the areas of multiuser communication and information theory.

Dr. Verdú is a recipient of the National University Prize of Spain, the Rheinstein Outstanding Junior Faculty Award of the School of Engineering and Applied Science at Princeton University, and the NSF Presidential Young Investigator Award. He is currently serving as Associate Editor of the IEEE TRANSACTIONS ON AUTOMATIC CONTROL, and as a member of the Board of Governors of the IEEE Information Theory Society.



Reprinted from

IEEE TRANSACTIONS ON AUTOMATIC CONTROL
Vol. 34, No. 11, November 1989


Optimal Decentralized Control in the Random
Access Multipacket Channel

SYLVIE GHEZ, Student Member, IEEE, SERGIO VERDÚ, Senior Member, IEEE, and
STUART C. SCHWARTZ, Senior Member, IEEE


Abstract— A decentralized control algorithm is sought that maximizes
the stability region of the infinite-user slotted multipacket channel and is
easily implementable. To this end, the perfect state information case
where the stations can use the instantaneous value of the backlog to
compute the retransmission probability is studied first. The best throughput
possible for a decentralized control protocol is obtained, as well as an
algorithm that achieves it. Those results are then applied to derive a
control scheme when the backlog is unknown, which is the case of
practical relevance. This scheme, based on a binary feedback, is shown to
be optimal given some restrictions on the channel multipacket reception
capability.


I. Introduction

MOST  studies  on  random  access  communications  rely  on  the 
assumption  that  when  two  or  more  packets  overlap,  all  the 
information  that  was  sent  is  irremediably  lost,  hence  the  need  to 
repeat  all  transmissions  at  some  later  time.  This  is  actually  a 
pessimistic  point  of  view,  since  there  are  many  examples  of 
random  access  systems  where  one  or  more  packets  may  be 
successful  in  the  presence  of  other  simultaneous  transmissions.  In 
order  to  represent  such  random  access  systems,  a  model  for  a 
channel  with  multipacket  reception  capability  has  been  developed 
in  [6]-[8].  We  consider  a  slotted  channel  with  an  infinite 
population  of  users,  and  we  assume  that  the  probability  of  having 
k  successes  in  a  slot  where  there  are  n  transmissions  depends  only 
on  the  collision  size  n 

$$\varepsilon_{nk} = P[k \text{ packets are correctly received} \mid n \text{ are transmitted}] \qquad (n \ge 1,\ 0 \le k \le n).$$

We define the reception matrix as

$$E = \begin{pmatrix} \varepsilon_{10} & \varepsilon_{11} & 0 & 0 & \cdots \\ \varepsilon_{20} & \varepsilon_{21} & \varepsilon_{22} & 0 & \cdots \\ \vdots & \vdots & \vdots & & \end{pmatrix}.$$

This model can be applied to channels with capture [1]-[3], [10],
[16], [18], [20], [23], [26], [28], [34] and to systems using
CDMA [22], [24], [29]. It is also relevant for many other
applications, such as systems with multiuser detectors [33] or, for
instance, the channel studied in [17], [31]. For more details about
this model, the reader is referred to [6] and [8]. Denoting by
$C_n = \sum_{k=1}^{n} k\,\varepsilon_{nk}$ the average number of packets
correctly received in collisions of size $n$, we assume that the limit
$C = \lim_{n \to \infty} C_n$ exists, as is usually the case with models
of practical interest. It has been proved in [8] that the Aloha random
access algorithm has a maximum stable throughput $\eta_0 = C$ in the
multipacket channel.
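
As a small illustration of these definitions, the sketch below computes the average number of successes per collision size from the rows of a reception matrix. The collision-channel rows follow from the usual model; the capture-channel numbers are made up for illustration, and the function and variable names are hypothetical.

```python
def expected_successes(row):
    """Return C_n = sum_k k * eps_{n,k} for one row [eps_{n,0}, ..., eps_{n,n}]."""
    if abs(sum(row) - 1.0) > 1e-9:
        raise ValueError("each row of the reception matrix must sum to 1")
    return sum(k * prob for k, prob in enumerate(row))


# Classical collision channel: a lone packet succeeds, any collision loses everything.
collision = {1: [0.0, 1.0], 2: [1.0, 0.0, 0.0], 3: [1.0, 0.0, 0.0, 0.0]}

# Hypothetical capture channel (illustrative numbers only): in a collision,
# exactly one packet survives with probability 0.5.
capture = {1: [0.0, 1.0], 2: [0.5, 0.5, 0.0], 3: [0.5, 0.5, 0.0, 0.0]}

collision_C = {n: expected_successes(row) for n, row in collision.items()}
capture_C = {n: expected_successes(row) for n, row in capture.items()}
print(collision_C)
print(capture_C)
```

In the collision channel the averages vanish for every collision size above one, so their limit is zero, whereas in the illustrative capture channel the limit would be 0.5 if the same row pattern continued.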

Decentralized control strategies have been shown [11], [12],
[19], [25], [30] to stabilize the slotted Aloha algorithm in the case
of the usual collision channel; hence, it is reasonable to expect that
when those strategies are used in the multipacket channel, the
resulting throughput will be higher than $\eta_0$. We consider schemes
of the form

$$p_n = F(S_n), \qquad S_{n+1} = G(S_n, Z_n) \tag{1}$$

where $p_n$ is the retransmission probability in slot $n$, $S_n$ is an
estimate of the backlog $X_n$ at the beginning of slot $n$, and $Z_n$ is the
feedback at the end of slot $n$. The numbers $A_n$ of new packets arriving
during slot $n$ are assumed to form a sequence of i.i.d. random
variables with probability distribution $P[A_n = k] = \lambda_k$ ($k \ge 0$),
such that the mean arrival rate $\lambda = \sum_{n=1}^{\infty} n\lambda_n$ is finite. Each of the
$A_{n-1}$ new packets that arrived during slot $n - 1$ is transmitted in
slot $n$ with probability $p_n$.
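
The recursion in (1) can be sketched as a short loop. The particular rules `F` and `G` below are placeholders chosen only to make the recursion concrete; they are not the scheme analyzed later in the paper, and the feedback labels are hypothetical.

```python
def run_controller(F, G, s0, feedback):
    """Apply p_n = F(S_n), S_{n+1} = G(S_n, Z_n) along a feedback sequence."""
    s = s0
    probabilities = []
    for z in feedback:
        probabilities.append(F(s))  # retransmission probability used in this slot
        s = G(s, z)                 # estimate carried into the next slot
    return probabilities, s


# Placeholder rules (assumptions for illustration only).
def F(s):
    return min(1.0, 1.0 / s)


def G(s, z):
    # Binary feedback: shrink the estimate after an empty slot, grow it otherwise.
    return max(1.0, s - 1.0) if z == "empty" else s + 0.5


probabilities, final_estimate = run_controller(
    F, G, 1.0, ["nonempty", "nonempty", "empty"]
)
print(probabilities, final_estimate)
```

Every station can run the same loop independently, since the update uses only the common feedback and never the true backlog.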

As  in  the  case  of  conventional  channels,  it  is  useful  to  study  first 
the  case  of  control  with  perfect  state  information  where  the  value 
of  the  backlog  is  given  to  the  users  prior  to  the  selection  of  the 
retransmission  probability.  To  keep  track  of  the  exact  value  of  the 
backlog,  a  central  controller  is  usually  necessary,  which  is  an 
unreasonable  requirement  for  most  practical  random  access 
channels.  However,  the  study  of  the  perfect  state  information  case 
allows  us  to  determine  an  upper  bound  to  the  best  throughput 
achievable  by  any  decentralized  control  of  the  form  (1),  and 
suggests  a  simple  implementation.  Those  results  are  in  turn  helpful 
to  derive  control  protocols  in  the  case  where  the  backlog  is 
unknown.  This  is  done  in  Section  III  where  we  consider  a  backlog 
estimate  which  is  recursively  updated  using  the  binary  feedback 
empty/nonempty.  In  addition,  it  is  assumed  throughout  the  paper 
that each station is informed when its packet is successfully
received.  It  is  proved  that  provided  a  certain  condition  on  the 
reception  matrix  holds,  the  throughput  achievable  with  this  type  of 
feedback  is  the  same  as  the  perfect  state  information  throughput. 
This  condition  is  verified  for  most  multipoint-to-point  channels  of 
practical  interest. 

In  a  paper  whose  translation  appeared  only  very  recently  [19] 
(after  our  work  [7]),  Mikhailov  has  derived  sufficient  conditions 
for  stability  and  instability  of  two-dimensional  Markov  chains. 
Although  this  was  meant  to  be  used  for  decentralized  control 
schemes  in  the  usual  collision  channel,  this  approach  is  powerful 
enough  to  be  applied  to  the  multipacket  channel.  In  Section  IV  we 
show  by  using  Mikhailov’s  result  that  the  scheme  presented  in 
Section  III  is  stable  under  weaker  assumptions.  However,  only  a 
weaker form of stability can be proved in this way.

II. Control of the Multipacket Channel with Perfect
State Information

In this section we assume that all the users know the value of $X_n$
at the beginning of slot $n$, and we let the retransmission
probability be a function of the exact value of the backlog, i.e.,
$p_n = F(X_n)$. In this ideal case, the system is much simpler to
analyze than in the general case (1) since $(X_n)_{n \ge 0}$ is a homogeneous
Markov chain. Our goal is to determine the optimal control
function $F^*$ that yields the largest ergodicity region, and the
corresponding throughput, denoted by $\eta_c$. For instance, it is well
known [4] that for the usual collision channel with the access rule
in effect here, $F^*(X_n) = 1/X_n$ is the retransmission probability
that minimizes the drift at each step, resulting in an ideal
throughput of $\eta_c = e^{-1}$.
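
The collision-channel figure quoted above is easy to check numerically. The sketch below (function name hypothetical) evaluates the probability that exactly one of n backlogged users transmits when each uses probability 1/n, and compares it with 1/e.

```python
import math


def collision_throughput(n, p):
    """Expected successes per slot in the collision channel:
    the probability that exactly one of n users transmits."""
    return n * p * (1.0 - p) ** (n - 1)


for n in (2, 10, 100, 1000):
    print(n, collision_throughput(n, 1.0 / n))
print("e^-1 =", math.exp(-1.0))
```

For a fixed backlog n, moving the probability away from 1/n in either direction lowers the value, which is the sense in which 1/n minimizes the drift.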

First note that all the results herein are valid provided that the
backlog Markov chain $(X_n, S_n)_{n \ge 0}$ corresponding to a control (1)
is irreducible and aperiodic. It can be easily checked that for both
access rules considered in this paper (see below), as well as all the
algorithms, a simple set of sufficient conditions for irreducibility
and aperiodicity is

a) $\lambda_0 \neq 0$

b) $\lambda_0 + \sum_{n=1}^{\infty} \lambda_n \varepsilon_{nn} < 1$

c) $\varepsilon_{10} \neq 0$

which are analogous to the conditions for the open-loop system
studied in [6]. The theorem below gives the best throughput
possible for a control protocol (1).

Theorem 1: There exists a retransmission probability $p_n^*$ that
minimizes the expected backlog increase when the backlog is
equal to $n$.

With such a retransmission probability, the system is stable for
$\lambda < \eta_c$ and unstable for $\lambda > \eta_c$, with

$$\eta_c = \sup_{x \ge 0} e^{-x} \sum_{n=1}^{\infty} C_n \frac{x^n}{n!}.$$
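
The supremum in Theorem 1 can be approximated numerically once the averages per collision size are known. The sketch below truncates the series and searches a grid; the truncation length, grid range, and the second example channel are assumptions made for illustration. If the supremum is not attained at a finite point, a grid search of this kind simply returns the edge of the grid.

```python
import math


def t(x, C, terms=60):
    """e^{-x} * sum_{n>=1} C(n) x^n / n!, truncated after `terms` terms."""
    total, term = 0.0, 1.0  # term holds x^n / n!, starting from n = 0
    for n in range(1, terms + 1):
        term *= x / n
        total += C(n) * term
    return math.exp(-x) * total


def closed_loop_throughput(C, x_max=20.0, steps=20000):
    """Grid search for the maximizer of t over [0, x_max]."""
    best_x, best = 0.0, 0.0
    for i in range(steps + 1):
        x = x_max * i / steps
        value = t(x, C)
        if value > best:
            best_x, best = x, value
    return best_x, best


# Collision channel: C_1 = 1 and C_n = 0 for n >= 2.
collision_x, collision_eta = closed_loop_throughput(lambda n: 1.0 if n == 1 else 0.0)

# Hypothetical channel where up to two simultaneous packets are all received.
two_x, two_eta = closed_loop_throughput(lambda n: float(n) if n <= 2 else 0.0)

print(collision_x, collision_eta)
print(two_x, two_eta)
```

For the collision channel the search recovers the maximizer 1 and the value 1/e quoted earlier for that channel.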

Proof of Theorem 1: The proof is based on standard drift
analysis techniques. $(X_t)_{t \ge 0}$ is a homogeneous Markov chain
which evolves according to

$$X_{t+1} = X_t + A_t - \Sigma_t \tag{2}$$

where $\Sigma_t$ is the number of packets successfully transmitted in slot
$t$. The system is defined to be stable if $(X_t)_{t \ge 0}$ is ergodic and
unstable otherwise. Let $d_n$ be the drift of $X_t$ at state $n$:
$d_n = E[X_{t+1} - X_t \mid X_t = n]$. We have $0 \le \Sigma_t \le X_t$, and if we denote
by $p$ the retransmission probability used in slot $t$, then for $n \ge 1$,
the probability of having $k$ successes is given by

$$P[\Sigma_t = k \mid X_t = n] = \sum_{j=k}^{n} \binom{n}{j} p^j (1-p)^{n-j} \varepsilon_{jk} \qquad (1 \le k \le n). \tag{3}$$

It then follows from (2) that the backlog drift at state $n \ge 1$ is
given by

$$d_n = \lambda - \sum_{k=1}^{n} k \sum_{j=k}^{n} \binom{n}{j} p^j (1-p)^{n-j} \varepsilon_{jk} = \lambda - \sum_{j=1}^{n} \binom{n}{j} p^j (1-p)^{n-j} C_j \tag{4}$$

which becomes $d_n(p) = \lambda - t_n(p)$ if we define $t_n(p)$ to be the
average number of successes given the backlog $n$ and the
retransmission probability $p$

$$t_n(p) = \sum_{j=1}^{n} \binom{n}{j} p^j (1-p)^{n-j} C_j. \tag{5}$$

Since $t_n(p)$ is a polynomial on the compact $[0, 1]$, it achieves its
maximum and we can define

$$p_n^* = \arg\max_{p \in [0,1]} t_n(p) = \arg\min_{p \in [0,1]} d_n(p).$$
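
For a finite backlog, the average number of successes and its maximizing retransmission probability can be computed directly. The sketch below evaluates the binomial mixture and finds the maximizer by grid search; the grid resolution and the function names are assumptions, and a finer method could replace the search.

```python
from math import comb


def t_n(n, p, C):
    """Average successes with backlog n and retransmission probability p:
    sum over j of Binomial(n, p) weights times C(j)."""
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) * C(j) for j in range(1, n + 1))


def best_retransmission_probability(n, C, steps=10000):
    """Grid-search maximizer of t_n(n, ., C) over [0, 1]."""
    grid = (i / steps for i in range(steps + 1))
    return max(grid, key=lambda p: t_n(n, p, C))


def collision_C(j):
    # Collision channel: only a lone transmission succeeds.
    return 1.0 if j == 1 else 0.0


p_star = best_retransmission_probability(10, collision_C)
print(p_star, t_n(10, p_star, collision_C))
```

With ten backlogged users in the collision channel the search returns a probability of about one tenth, in line with the 1/n rule quoted for that channel.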

We  now  proceed  to  compute  the  limit  of  the  drift  when  the 
retransmission  probability  p *  is  used.  We  show  that 

°*  xn 

Urn  t„(p*)^ sup  e~x  V  C„  —  =  sup  t(x).  (6) 

»-*  xto  n\  xto 

Let  us  first  assume  that  C  <  +  oo. 

Property  I: 

lim  t(x)-C. 

x-*o* 

We  have  for  n  >  M 

|/(*)-C|se-»C+e-*f  ~|C,-C|+  £  ^\Cn-C\. 

|  *«A/+  I 

(7) 

P>ck  e  >  0  and  fix  M such  that  |C„  -  C|  <  e  for  n  >  M.  Then  if 
Be  is  an  upper  bound  on  the  sequence  (Cn).E!,  (7)  yields 

M  n 

\t(x)-C\se-xC+2Bce'x  V  -  +  < 

,  n\ 

n  •  I 

and  the  right-hand  side  of  this  last  equation  goes  to  zero  as  x  goes 
to  infinity. 

Property  2:  For  all  e  >  0,  there  exists  A  >  0  such  that  for  all 
np  >  A,  1 t„( p)  -  C|  <  e.  We  have 

| t„(p) - C| s 2  ^  pJ(  1  -p)"‘'|C/-Cj  +  (1  -  p)*C. 

Choosing  Af  as  for  Property  1  we  get 

\Up)-c\siBt  2  (  n  )/»'(i 

J- o  \J  / 

Let  us  denote  by  R*  the  random  variable  corresponding  to  the 
number  of  retransmissions  in  a  slot  given  that  the  backlog  is  equal 
to  n.  We  have 

or  r\D  >  2M.  Then  from  the  Chebyshev  inequality 

P{Rn*M}<.—  (8) 

np 

and  Property  2  follows. 
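
The tail bound used in Property 2 can be checked against the exact binomial probability. The sketch assumes the bound takes the form 4/(np), which is what Chebyshev's inequality gives when np > 2M, since the variance np(1-p) divided by (np/2)^2 is at most 4/(np); the parameter values are arbitrary.

```python
from math import comb


def prob_at_most(n, p, m):
    """Exact P[R_n <= m] for R_n ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(m + 1))


def chebyshev_bound(n, p):
    """Var(R_n) / (np/2)^2 = 4(1-p)/(np), bounded above by 4/(np)."""
    return 4.0 / (n * p)


M = 5
for n, p in ((200, 0.1), (1000, 0.05), (5000, 0.02)):
    assert n * p > 2 * M  # regime in which the bound is derived
    print(n, p, prob_at_most(n, p, M), chebyshev_bound(n, p))
```

The exact probabilities are far smaller than the bound, but the proof only needs a quantity that vanishes as np grows.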

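Property 3: $t_n(x/n)$ converges uniformly to $t(x)$ on any
compact $[0, A]$.

Fix $\epsilon > 0$ and choose $M$ such that $\sum_{j=M+1}^{\infty} A^j C_j / j! < \epsilon$. Then for
$n > M + 1$ and $x \in [0, A]$

$$|t_n(x/n) - t(x)| \le \sum_{j=1}^{M} \frac{x^j C_j}{j!} \left| \frac{n(n-1) \cdots (n-j+1)}{n^j} \left(1 - \frac{x}{n}\right)^{n-j} - e^{-x} \right| + 2\epsilon.$$

Since $\lim_{n \to \infty} n(n-1) \cdots (n-j+1)/n^j = 1$ for $1 \le j \le M$,
it is enough to show that $(1 - x/n)^{n-j}$ converges uniformly to $e^{-x}$
for $1 \le j \le M$. We have

$$\left(1 - \frac{x}{n}\right)^{n-j} - e^{-x} \le e^{-x}\left[e^{xj/n} - 1\right] \le e^{AM/n} - 1. \tag{9}$$

On the other hand, for $n > A$,

$$\left(1 - \frac{x}{n}\right)^{n-j} - e^{-x} \ge \left(1 - \frac{x}{n}\right)^{n} - e^{-x} \ge e^{-x}\left(e^{A + n\log(1 - A/n)} - 1\right) \ge e^{A + n\log(1 - A/n)} - 1 \tag{10}$$

and uniform convergence follows from (9) and (10).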
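
Uniform convergence on a compact interval can be observed numerically in the simplest special case. The sketch below uses the collision channel only (a lone transmission succeeds, all collisions fail), for which both functions have closed forms; the interval length and grid are arbitrary choices.

```python
import math


def t_n_collision(n, x):
    """Average successes with backlog n and probability x/n, collision channel."""
    return x * (1.0 - x / n) ** (n - 1)


def t_collision(x):
    """Limit function for the collision channel."""
    return x * math.exp(-x)


def sup_gap(n, A=3.0, points=3001):
    """Largest gap between the two functions over a grid on [0, A] (needs n > A)."""
    xs = [A * i / (points - 1) for i in range(points)]
    return max(abs(t_n_collision(n, x) - t_collision(x)) for x in xs)


gaps = {n: sup_gap(n) for n in (10, 100, 1000)}
print(gaps)
```

The largest gap over the interval shrinks roughly in proportion to 1/n, consistent with the bounds (9) and (10).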

Property 4: $t_n(x/n)$ converges uniformly to $t(x)$ for $x \ge 0$.

Fix $\epsilon > 0$. From Properties 1 and 2 we can fix $A$ such that:

i) for all $np > A$, $|t_n(p) - C| < \epsilon$,

ii) for all $x > A$, $|t(x) - C| < \epsilon$.

Then we distinguish two cases. If $x \in [0, A]$, then from
Property 3 there exists $N$ such that for all $n > N$,
$|t_n(x/n) - t(x)| < \epsilon$. If on the other hand $x \in (A, +\infty)$, we have

$$|t_n(x/n) - t(x)| \le |t_n(x/n) - C| + |t(x) - C| \le 2\epsilon \tag{11}$$

from i) and ii).

Thus, we have shown that when $C$ is finite, $t_n(x/n)$ converges
uniformly to $t(x)$ for $x \ge 0$. It follows that
$\lim_{n \to \infty} \sup_{x \ge 0} t_n(x/n) = \sup_{x \ge 0} t(x)$ and so (6) is proved.

Finally, we show that (6) holds when $C = +\infty$. Choose $A$
arbitrarily large and $M$ such that $C_n > A$ for $n > M$. Then for $n > M$

$$t_n(x/n) \ge A\,(1 - P[R_n \le M]).$$

From (8), $P[R_n \le M]$ is arbitrarily small for $n \cdot x/n = x$ large
enough. Therefore, $\sup_{x \ge 0} t_n(x/n) \to +\infty$ and
$\lim_{n \to \infty} t_n(p_n^*) = +\infty$. Since it is clear that if $C = +\infty$,
then $\sup_{x \ge 0} t(x) = +\infty$, (6) holds.

From the equality $\lim_{n \to \infty} d_n(p_n^*) = \lambda - \sup_{x \ge 0} t(x)$ and Pakes'
lemma in [21], it follows that if $\lim_{n \to \infty} C_n = +\infty$, then
$\lim_{n \to \infty} d_n(p_n^*) = -\infty$, and the system is always stable, whereas if
$\lim_{n \to \infty} C_n < +\infty$, then $(X_t)_{t \ge 0}$ is ergodic for
$\lambda < \eta_c = \sup_{x \ge 0} t(x)$.

Also, it is shown in the Appendix that Kaplan's condition holds
for this system when the sequence $(C_n)_{n \ge 1}$ is bounded; thus from
Kaplan's result [13], the backlog Markov chain is nonergodic
when $\lambda > \eta_c$. □

It is intuitively obvious that no decentralized control algorithm
of the form (1) can have a maximum stable throughput larger than
$\eta_c$. The theorem below gives a rigorous proof of this fact and also
shows that this throughput can be achieved with a control which is
much simpler than $p_n^*$.

Theorem 2: The best throughput achievable by a decentralized
control algorithm (1) is $\eta_c = \sup_{x \ge 0} e^{-x} \sum_{n=1}^{\infty} (x^n/n!)\, C_n$. If
$\eta_c > C = \lim_{n \to \infty} C_n$, then there exists a constant $A > 0$ such that the
control $p_t = A/X_t$ for $X_t \ge A$ yields the optimal throughput $\eta_c$.

Proof of Theorem 2: To prove the first part of the theorem
we use a result of [27] which is a generalization of Kaplan's
theorem. If $p_t = F(S_t)$ and $S_{t+1} = G(S_t, Z_t)$, consider the
Markov chain $(X_t, S_t)$ and the Lyapunov function $V(n, s) = n$.
Assume that $\lambda > \eta_c$. Then

$$E[V(X_{t+1}, S_{t+1}) - V(X_t, S_t) \mid X_t = n,\ S_t = s] = \lambda - \sum_{j=1}^{n} \binom{n}{j} F(s)^j (1 - F(s))^{n-j} C_j \ge d_n(p_n^*) > 0 \tag{12}$$

for all $n$ large enough and all $s$. Therefore, the drift of $V$ is strictly
positive outside a finite subset of the state space. Since it is shown
in the Appendix that the generalized Kaplan's condition is
verified, it is enough to conclude that $(X_t, S_t)$ is nonergodic.
Hence, $\eta_c$ is indeed the best throughput achievable by any
decentralized control algorithm of the form (1).

To prove the second part of the theorem, we need the following
property.

Property 5: If for all $x \ge 0$, $t(x) < \sup_{x \ge 0} t(x)$, then
$\sup_{x \ge 0} t(x) = C$.

If $\sup_{x \ge 0} t(x) = +\infty$, it is easily seen that $C = +\infty$. If
$\sup_{x \ge 0} t(x) < +\infty$, then $C < +\infty$. Consider a sequence $(x_n)_{n \ge 1}$ of
nonnegative reals such that $\lim_{n \to \infty} t(x_n) = \sup_{x \ge 0} t(x)$. If $(x_n)_{n \ge 1}$
was bounded above by $K < +\infty$, we would have for all $n \ge 1$,
$t(x_n) \le \sup_{x \in [0,K]} t(x)$, and in the limit
$\sup_{x \ge 0} t(x) = \sup_{x \in [0,K]} t(x)$. Then there would exist $x_0 \in [0, K]$ such that
$t(x_0) = \sup_{x \ge 0} t(x)$, which is a contradiction. Therefore, $(x_n)_{n \ge 1}$ is
unbounded, and one can build a subsequence $(x_{n_k})_{k \ge 1}$ such that
$\lim_{k \to \infty} x_{n_k} = +\infty$. We still have, of course,
$\lim_{k \to \infty} t(x_{n_k}) = \sup_{x \ge 0} t(x)$, but on the other hand, we have
$\lim_{k \to \infty} t(x_{n_k}) = \lim_{x \to \infty} t(x)$. From Property 1 in the proof of
Theorem 1, $\lim_{x \to \infty} t(x) = C$ and Property 5 follows.

Thus, if $\eta_c > C$, then $t(x)$ achieves its supremum at some finite
positive real $A$. Let us consider the control $p_t = A/X_t$ for $X_t \ge A$.
(Note that the value of the retransmission probability is left
unspecified for $X_t < A$ because it does not affect the throughput.)
Then from (4) $d_n = \lambda - t_n(A/n)$, and from Property 3 in the
proof of Theorem 1 $\lim_{n \to \infty} d_n = \lambda - t(A)$. Then it follows from
[21] that $(X_t)_{t \ge 0}$ is ergodic if $\lambda < t(A)$ and from [13] and the
Appendix that $(X_t)_{t \ge 0}$ is nonergodic if $\lambda > t(A)$. Thus, the
maximum stable throughput of the system is
$t(A) = \sup_{x \ge 0} t(x) = \eta_c$. □
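
The effect of the control in Theorem 2 can be seen in a small Monte Carlo run. The sketch below is restricted to the collision channel, where the maximizing constant is A = 1, and assumes Poisson arrivals; the arrival rates, run length, and seed are arbitrary, and the code is an illustration rather than part of the analysis.

```python
import math
import random


def poisson(rng, mean):
    """Knuth's method for a Poisson sample; adequate for small means."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_backlog(arrival_rate, A=1.0, slots=5000, seed=0):
    """Time-averaged backlog in the collision channel with perfect state
    information and retransmission probability min(1, A / backlog)."""
    rng = random.Random(seed)
    backlog, total = 0, 0
    for _ in range(slots):
        if backlog > 0:
            p = min(1.0, A / backlog)
            transmitters = sum(rng.random() < p for _ in range(backlog))
            if transmitters == 1:  # success only when exactly one user transmits
                backlog -= 1
        backlog += poisson(rng, arrival_rate)  # new packets join the backlog
        total += backlog
    return total / slots


print(simulate_backlog(0.20))  # arrival rate below 1/e
print(simulate_backlog(0.70))  # arrival rate above 1/e
```

Below the threshold 1/e the time-averaged backlog stays small, while above it the backlog grows roughly linearly with the number of slots, as the drift argument predicts.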

Note that the closed-loop throughput obtained in Theorems 1
and 2 can be interpreted as $\eta_c = \sup_{x \ge 0} E[C_N]$, that is, as the
supremum over $x$ of the expected value of $C_N$ if $N$ is a Poisson
distributed random variable with mean $x$. Note that if we were to
follow the popular approximation [1], [2], [10], [16], [18], [24],
[26] that assumes that the number of transmissions in each slot, $N$,
is Poisson distributed, and if we could choose any positive number
as the mean of $N$ by regulating the retransmission probability, the
throughput would be equal to the average number of successes per
slot, $E[C_N]$, maximized over the mean of $N$. As in the usual
collision channel, a wrong analysis leads to a correct conclusion.
Several examples are gathered in Table I (see [8] for details).

Probably the most important conclusion of this section is that in
general it is not necessary to compute the exact value of $p_n^*$, which
would require a large amount of on-line computations, and
seriously hinder any application of Theorem 1 to the case where
the backlog is unknown. Two cases may occur. If $t(x)$ does not
attain its supremum, from Property 5 in the proof of Theorem 2,
we have $\eta_c = \eta_0 = C$ (e.g., this happens in the model developed
in [6] for mobile users with pairwise transmissions). In this case
no throughput improvement can be achieved by varying the
retransmission probability, and therefore it is enough to restrict
attention to the open-loop strategy studied in [8]. On the other
hand, if there exists $A$, $0 < A < +\infty$, such that $t(A) = \sup_{x \ge 0} t(x)$,
then we have shown in the proof of Theorem 2 that the
control $p_t = A/X_t$ for $X_t \ge A$ yields a maximum stable
throughput $t(A) = \eta_c$, meaning that the system is optimal. Hence,
only $A$ has to be computed, and this can be done before starting
the operation of the system.

TABLE I
OPEN-LOOP AND CLOSED-LOOP THROUGHPUTS FOR SEVERAL
MULTIPACKET CHANNELS


Although in most practical applications $(C_n)_{n \ge 1}$ does have a limit,
it is worth noticing that Theorem 1 can be generalized to the case where $C$
does not exist. It can be shown [9] that if the drift is minimized at each
step, then the system is stable for $\lambda < \sup_{x \ge 0} t(x)$ and
unstable for $\lambda > \sup_{x \ge 0} t(x) + \limsup_{n \to \infty} C_n -
\liminf_{n \to \infty} C_n$. As in the open-loop system when
$(C_n)_{n \ge 1}$ does not have a limit, nothing more can be said about the
throughput without further information on the sequence $(C_n)_{n \ge 1}$. But
the main drawback in such a case is that there may not exist any control
$p_t = A/X_t$ that yields the optimal throughput.

The access rule for new packets that we have been considering so far is
usually referred to as delayed first transmission (DFT). With this access
rule, newly arrived packets are treated exactly in the same way as backlogged
packets. Let us now examine what happens when, on the contrary, an immediate
first transmission (IFT) rule is used, that is, when new packets are
transmitted with probability one in the slot immediately following their
arrival. It has been proved in [8] that the open-loop throughput is the same
for both first transmission rules. The closed-loop throughput, on the other
hand, depends on the access rule. For instance, it is well known [4] that for
the usual collision channel in the IFT case, the optimal retransmission
probability is $p_n^* = (\lambda_0 - \lambda_1)/(\lambda_0 n - \lambda_1)$,
yielding an optimal throughput $\lambda_0 e^{\lambda_1/\lambda_0} e^{-1}$, in
contrast to the throughput $\eta_c = e^{-1}$ for the DFT case. In the
multipacket channel with the IFT rule, the optimal throughput depends not
only on the mean but on the whole distribution of new packet arrivals.
Interestingly enough, it can be proved that both throughputs coincide when
the new packet arrivals are Poisson distributed. Still with the same method
as in the proof of Theorem 1, it can be easily shown that there exists a
retransmission probability that minimizes the drift $d_n$ at state $n$. With
such a retransmission probability, the system with IFT rule is stable for
$\lambda < \sup_{x \ge 0} T(x)$ and unstable for
$\lambda > \sup_{x \ge 0} T(x)$, with

$$T(x) = e^{-x} \sum_{n=0}^{\infty} \frac{x^n}{n!} \sum_{k=0}^{\infty} \lambda_k C_{n+k}$$

where $\lambda_k$ is the probability of $k$ new arrivals in a slot and where
we have defined $C_0 = 0$ for notational convenience. It can also be proved
that a control of the form $p_n = A/X_n$ yields a maximum stable throughput
$T(A)$. Since $\sup_{x \ge 0} T(x)$ depends on the whole new packet arrival
distribution $(\lambda_n)_{n \ge 0}$, this result is not as conclusive as in
the DFT case. This is because the stability region
$\lambda < \sup_{x \ge 0} T(x)$ is actually given in the form of an implicit
equation in $\lambda$, which cannot be solved in general without further
specifications on the distribution $(\lambda_n)_{n \ge 0}$. For instance,
this stability region could be empty. Consider, for example, the usual
collision channel with possibly some added noise, $0 < C_1 \le 1$ and
$C_n = 0$ for $n \ge 2$. Then $T(x) = C_1 e^{-x}(\lambda_1 + \lambda_0 x)$
and $T'(x) = C_1 e^{-x}(\lambda_0 - \lambda_1 - \lambda_0 x)$. Therefore, for
any distribution such that $\lambda_0 < \lambda_1$, $T(x)$ is maximum at
$T(0) = C_1 \lambda_1$, and the stability region is empty since
$C_1 \lambda_1 \le \lambda_1 \le \lambda$. Note that in this sense, the
immediate first transmission does not perform as well as the delayed first
transmission, with which the system can always be stabilized.
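
The collision-channel example can be worked through in a few lines. This is
a minimal sketch: the function names and the three-point arrival
distribution are hypothetical, chosen only so that the probability of no
arrival is smaller than the probability of one arrival.

```python
import math

def T_collision(x, lam0, lam1, C1=1.0):
    """T(x) = C_1 e^{-x} (lambda_1 + lambda_0 x) for the noisy collision channel."""
    return C1 * math.exp(-x) * (lam1 + lam0 * x)

def sup_T_collision(lam0, lam1, C1=1.0):
    """Maximizer obtained from T'(x) = C_1 e^{-x} (lambda_0 - lambda_1 - lambda_0 x)."""
    x_star = max(0.0, 1.0 - lam1 / lam0) if lam0 > 0 else 0.0
    return x_star, T_collision(x_star, lam0, lam1, C1)

# Hypothetical arrival distribution with lambda_0 < lambda_1:
# P[A=0] = 0.3, P[A=1] = 0.5, P[A=2] = 0.2, so the mean rate is 0.9.
pmf = [0.3, 0.5, 0.2]
mean_rate = sum(k * p for k, p in enumerate(pmf))
x_star, sup_T = sup_T_collision(pmf[0], pmf[1])
region_is_empty = not (mean_rate < sup_T)

# Poisson arrivals with rate 0.3 for comparison.
lam = 0.3
_, sup_T_poisson = sup_T_collision(math.exp(-lam), lam * math.exp(-lam))
```

With the hypothetical distribution the supremum is attained at $x = 0$ and
the mean rate exceeds it, so no arrival rate satisfies the stability
condition. With Poisson arrivals the same formula returns $e^{-1}$.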

If there are solutions to $\lambda < \sup_{x \ge 0} T(x)$, then the best
throughput achievable by the class of algorithms in (1) is
$\nu_c = \sup \{\lambda : \lambda < \sup_{x \ge 0} T(x)\}$. This is what
happens, for instance, when the new packet arrivals are Poisson distributed.

Theorem 3: If the new packet arrivals are Poisson distributed, the best
throughput achievable with an IFT rule is the same as in the DFT case,
$\nu_c = \sup_{x \ge 0} t(x)$.

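Proof of Theorem 3: If $\lim_{n \to \infty} C_n = +\infty$, then
$\nu_c = \eta_c = +\infty$. Assume now that $C < +\infty$. We get

$$T(x) = e^{-x} \sum_{n=0}^{\infty} \frac{x^n}{n!} \sum_{k=0}^{\infty} e^{-\lambda} \frac{\lambda^k}{k!} C_{n+k}
       = e^{-(x+\lambda)} \sum_{n=0}^{\infty} C_n \frac{(x+\lambda)^n}{n!}. \tag{13}$$

Thus, in this case, $T(x)$ depends only on $\lambda$, and to clarify the
proof below, we denote it by $T_\lambda(x)$:

$$T_\lambda(x) = t(x + \lambda). \tag{14}$$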

Assume that $t(x)$ does not achieve its supremum. Then from Property 5 in
the proof of Theorem 2, we have $\eta_c = C = \lim_{x \to \infty} t(x)$. It
follows from (14) that for any $\lambda > 0$,
$\lim_{x \to \infty} T_\lambda(x) = C$. Therefore, for all $\lambda > 0$,
$\sup_{x \ge 0} T_\lambda(x) \ge C$. Hence, for all $\lambda > 0$,
$\sup_{x \ge 0} T_\lambda(x) = \sup_{x \ge 0} t(x)$, and by definition of
$\nu_c$, we finally get $\nu_c = \sup_{x \ge 0} t(x)$. Note that $T_\lambda$
does not achieve its supremum, in the sense that if there existed
$\lambda \in (0, \nu_c)$ and $x_\lambda \ge 0$ such that
$\nu_c = T_\lambda(x_\lambda)$, we would have
$\sup_{x \ge 0} t(x) = t(\lambda + x_\lambda)$.

Assume now that $t(x)$ does achieve its supremum: there exists $x_0 \ge 0$
such that $\sup_{x \ge 0} t(x) = t(x_0)$. Then for all $\lambda$ in
$[0, x_0]$, $T_\lambda(x_0 - \lambda) = \sup_{x \ge 0} t(x) \ge
\sup_{x \ge 0} T_\lambda(x)$. Thus, for all $\lambda \in (0, x_0]$

$$\sup_{x \ge 0} T_\lambda(x) = \sup_{x \ge 0} t(x) = T_\lambda(x_0 - \lambda). \tag{15}$$

We have $t(x) \le x$ for all $x \ge 0$, therefore
$\sup_{x \ge 0} t(x) \le x_0$. Together with (15), it follows that for all
$\lambda \in (0, \sup_{x \ge 0} t(x))$,
$\lambda < \sup_{x \ge 0} T_\lambda(x)$, and therefore
$\nu_c \ge \sup_{x \ge 0} t(x) = \eta_c$. Since from (14)
$\sup_{x \ge 0} T_\lambda(x) \le \sup_{x \ge 0} t(x) = \eta_c$ for all
$\lambda$, we get $\nu_c \le \eta_c$ and finally
$\nu_c = \eta_c = \sup_{x \ge 0} t(x)$. Note that from (14), $T_\lambda$
reaches its supremum too, since for all $\lambda < \nu_c$, there exists
$x_\lambda \ge 0$ such that $T_\lambda(x_\lambda) = \nu_c$.

Note that we have also shown in this proof that $T(x)$ reaches its supremum
iff $t(x)$ does, which means that $\nu_c$ can be achieved with a control of
the form $p_n = A/X_n$ iff $\eta_c$ can. □
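
Identity (14), on which the proof rests, can be verified numerically. The
sketch below assumes a hypothetical two-term channel and Poisson arrivals;
the names `T_poisson` and `channel` are illustrative and do not appear in
the paper.

```python
import math

def t(x, C):
    """t(x) = e^{-x} * sum_n C_n x^n / n!  (C maps n -> C_n, others are 0)."""
    return math.exp(-x) * sum(c * x**n / math.factorial(n) for n, c in C.items())

def T_poisson(x, lam, C):
    """Double series for T(x) when P[A = k] = e^{-lam} lam^k / k!.

    For each m with C_m != 0 the inner loop runs over all splits m = n + k
    (n retransmissions, k new arrivals), so the sum is exact when C is finite.
    """
    total = 0.0
    for m, c in C.items():
        for n in range(m + 1):
            k = m - n
            poisson_k = math.exp(-lam) * lam**k / math.factorial(k)
            total += x**n / math.factorial(n) * poisson_k * c
    return math.exp(-x) * total

channel = {1: 1.0, 2: 0.5}   # hypothetical C_n, chosen only for illustration
lhs = T_poisson(0.4, 0.3, channel)
rhs = t(0.4 + 0.3, channel)
```

The two values agree up to rounding, which is the binomial-theorem step
hidden in (13): summing over all splits of $m$ transmissions into
retransmissions and new arrivals produces $(x + \lambda)^m / m!$.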

III. Optimal Control for the Multipacket Channel

It is assumed from now on that the users do not have access to the value of
the backlog, so the problem becomes one of control of the Markov chain with
partial state information provided by the channel feedback. We build a
backlog estimate $S_t$ with feedback $Z_t$, which is such that $Z_t = 0$ if
slot $t$ was empty, and $Z_t \ne 0$ otherwise. The results of the previous
section strongly suggest that we should use as a retransmission probability
$p_t = A/S_t$, where $A$ is a point at which $t(x)$ achieves its supremum
(according to Property 5, $A$ is assumed to be finite). We show that the
resulting control algorithm achieves the optimal maximum stable throughput
$\eta_c$. This holds provided that the following assumption on the reception
matrix is verified.

C0: There exist $\delta > 0$ and $B$ such that for all $n \ge 1$,
$\sum_k e^{\delta k} \epsilon_{nk} \le B$.

The purpose of condition C0 is to bound the probability of having large
numbers of simultaneous successes. Unbounded numbers of successes per slot
are difficult to deal with because they may result in very large
instantaneous errors in the backlog estimate. Note that condition C0 is
likely to hold in most multipoint-to-point channels because of practical
limitations on the receiver capabilities, and that it is verified for all
the examples in Table I.

Theorem 4: Assume that there exists $A \in (0, +\infty)$ such that
$t(A) = \sup_{x \ge 0} t(x)$, that the new packet arrivals
$(A_t)_{t \ge 0}$ are exponential type¹, and that condition C0 holds. If
$\alpha < 0$ and $\beta > 0$ verify the following two conditions²:

C1: $\beta > \lambda$

C2: $\beta(1 - e^{-A}) + \eta_c - \lambda + \alpha e^{-A} = 0$

then the control algorithm (cf. the control laws proposed in [15], [19],
and [25])

$$p_t = \frac{A}{S_t}$$

$$S_{t+1} = \max \{A, S_t + \alpha I(Z_t = 0) + \beta I(Z_t \ne 0)\}$$

has maximum stable throughput equal to $\eta_c$.

Fig. 1. Drift properties (Proposition 1).
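
To make the control law concrete, here is a minimal simulation sketch for
the usual collision channel, for which $A = 1$ and $\eta_c = e^{-1}$. The
helper names, the Poisson sampler, and the parameter values in the usage
note are assumptions made for illustration; the paper does not prescribe
an implementation.

```python
import math
import random

def alpha_from_c2(beta, lam, A, eta_c):
    """Solve condition C2 for alpha given beta, lam, A, and eta_c."""
    return (lam - eta_c - beta * (1.0 - math.exp(-A))) / math.exp(-A)

def poisson(rng, mean):
    """Knuth's multiplication method for a Poisson variate (fine for small means)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate(lam, beta, slots, seed=0):
    """Run the estimate-driven control on the usual collision channel (A = 1)."""
    A, eta_c = 1.0, math.exp(-1.0)
    alpha = alpha_from_c2(beta, lam, A, eta_c)
    rng = random.Random(seed)
    backlog, estimate = 0, A
    for _ in range(slots):
        p = A / estimate                       # p_t = A / S_t, always in (0, 1]
        attempts = sum(rng.random() < p for _ in range(backlog))
        successes = 1 if attempts == 1 else 0  # collision channel: C_1 = 1
        step = alpha if attempts == 0 else beta
        estimate = max(A, estimate + step)     # S_{t+1} = max{A, S_t + ...}
        backlog += poisson(rng, lam) - successes
    return backlog, estimate
```

A call such as `simulate(lam=0.2, beta=0.5, slots=2000)` uses
$\beta > \lambda$ as required by C1 and derives $\alpha$ from C2, which
gives a negative value. The returned pair is the final backlog and
estimate; a single run illustrates the mechanics only and is not evidence
of stability.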

Proof of Theorem 4: The proof is based on the method developed in [30]. The
idea is to use the properties of the homogeneous two-dimensional vector
Markov chain of the backlog and its estimate $M_t = (X_t, S_t)$ to build a
Lyapunov function whose drift is negative in the first quadrant of the
$(n, s)$ plane when $\lambda < \eta_c$. It turns out that this fails to hold
in two cones of the state space, but it can be proved that the $J$-step
drift of the Lyapunov function is negative for some integer $J$, and that
this is enough to ensure that $M_t$ is geometrically ergodic. It follows
from Theorem 2 that $M_t$ is nonergodic if $\lambda > \eta_c$. For
substantial portions of the proof, the reader is referred to [9] because of
space limitations.

Denote by $\tilde{X}_t = S_t - X_t$ the error in the backlog estimate. The
first part of the proof mainly consists of computing and approximating the
drifts of $X_t$ and $\tilde{X}_t$, which are the basic building blocks for
the Lyapunov function.

Denote by $c(n, s) = E[X_{t+1} - X_t \mid M_t = (n, s)]$ the backlog drift
at state $(n, s)$, and by
$d(n, s) = E[\tilde{X}_{t+1} - \tilde{X}_t \mid M_t = (n, s)]$ the drift of
the backlog error. For technical reasons, what we most often use in the
proof are the truncated drifts, which correspond to the value of the drifts
restricted to those paths where the variation in the backlog is bounded by
some integer $J$, that is,
$c(n, s, J) = E[(X_{t+1} - X_t) I(|X_{t+1} - X_t| \le J) \mid M_t = (n, s)]$
and
$d(n, s, J) = E[(\tilde{X}_{t+1} - \tilde{X}_t) I(|X_{t+1} - X_t| \le J) \mid M_t = (n, s)]$.
Clearly, these truncated drifts will be good approximations of $c(n, s)$
and $d(n, s)$, respectively, when $J$ is large. It will turn out that the
drifts depend primarily on the ratio $x = n/s$ for large values of $n$ or
$s$. Thus, it is convenient to define the following two regions in the
$(n, s)$ plane:

$$C(\lambda_0, \lambda_1) = \{(n, s) : n \ge 0,\ s \ge 0,\ 1 + \lambda_0 \le n/s \le 1 + \lambda_1\}$$

$$U_M = \{(n, s) : n \ge M \text{ or } s \ge M\}$$

where $\lambda_0$ and $\lambda_1$ are such that
$-\infty \le \lambda_0 \le \lambda_1 \le +\infty$. The aim of the first part
of the proof is to show Proposition 1 below, which summarizes all the
properties of the drifts that are needed for our purposes (see Fig. 1).

¹ $A_t$ is exponential type if there exists $\delta > 0$ such that
$E[e^{\delta A_t}]$ is finite. For instance, this is true if $A_t$ is
Poisson distributed.

² Conditions C1 and C2 define half a straight line in the
$(\alpha, \beta)$ plane, and therefore an infinite number of possible
estimation schemes, all of them yielding the same throughput.


Proposition 1: There exist $\gamma \in (0, 1/5)$, $\delta > 0$, and an
integer $J_0 > 0$ such that for all $J \ge J_0$:

i) for all $(n, s) \in C(-5\gamma, 5\gamma) \cap U_M$,
$c(n, s) \le -\delta$ and $c(n, s, J) \le -\delta + v(J)$;

ii) for all $(n, s) \in C(-\infty, -\gamma) \cap U_M$,
$d(n, s) \le -\delta$ and $d(n, s, J) \le -\delta + v(J)$;

iii) for all $(n, s) \in C(\gamma, +\infty) \cap U_M$,
$d(n, s) \ge \delta$ and $d(n, s, J) \ge \delta - v(J)$

where $v(J)$ is a nonnegative function which goes to zero as $J$ goes to
infinity.

The detailed proof of Proposition 1 can be found in [9]. After computing
the value of the drifts

$$c(0, s) = \lambda \tag{16a}$$

$$c(n, s) = \lambda - \sum_{k=1}^{n} \binom{n}{k} \left(\frac{A}{s}\right)^k \left(1 - \frac{A}{s}\right)^{n-k} C_k \qquad (n \ge 1) \tag{16b}$$

$$d(0, s) = \max \{A - s, \alpha\} - \lambda \tag{17a}$$

$$d(n, s) = \beta - \lambda + (\max \{A - s, \alpha\} - \beta) \left(1 - \frac{A}{s}\right)^n + \sum_{k=1}^{n} \binom{n}{k} \left(\frac{A}{s}\right)^k \left(1 - \frac{A}{s}\right)^{n-k} C_k \qquad (n \ge 1) \tag{17b}$$

we work out upper and lower bounds by truncating the sums (16) and (17) to
a fixed number of terms, and then we approximate those bounds as a function
of the sole variable $n/s$. The main idea is that the dynamic behavior of
the Markov vector $M_t = (X_t, S_t)$ depends essentially on the ratio
$X_t/S_t$. For instance, if $x$ is nearly equal to 1, the backlog estimate
is close to its ideal value, and we should have $c(n, s) < 0$ since the
backlog drift is negative in the perfect state information case. Also, a
well-behaved estimate should be such that if $x < 1$, then the error
$s - n$ is positive, and therefore should have a negative drift
$d(n, s) < 0$ (see [15]). In the same way, we expect to have
$d(n, s) > 0$ for $x > 1$.
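
The dependence of the backlog drift on the ratio $n/s$ can be seen by
evaluating the binomial sum directly. The sketch below assumes that each of
the $n$ backlogged packets transmits independently with probability $A/s$
and that the channel is described by a finite map of $C_k$ values; the name
`backlog_drift` is illustrative.

```python
import math

def backlog_drift(n, s, lam, A, C):
    """Arrival rate minus expected successes when each of the n backlogged
    packets transmits independently with probability A / s (requires s >= A)."""
    if n == 0:
        return lam
    p = A / s
    expected_successes = sum(
        c * math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k, c in C.items() if 1 <= k <= n
    )
    return lam - expected_successes

collision = {1: 1.0}          # usual collision channel
c_ideal = backlog_drift(1000, 1000.0, 0.3, 1.0, collision)   # x = n/s = 1
c_under = backlog_drift(3000, 1000.0, 0.3, 1.0, collision)   # x = n/s = 3
```

For the collision channel with $A = 1$ and arrival rate 0.3, the drift at
$x = 1$ is close to $0.3 - e^{-1}$, which is negative, while at $x = 3$ the
estimate is far too small, too many packets transmit, and the drift turns
positive.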
Let us define the following Lyapunov function:

$$V(n, s) = \max \left\{ n,\ \frac{1 + 3\gamma}{3\gamma}(n - s),\ \frac{1 - 3\gamma}{3\gamma}(s - n) \right\}$$

where the constants have been chosen so that $V$ is continuous. $V(n, s)$
is equal to the first, second, and third term inside the bracket when
$(n, s)$ is in $C(-3\gamma, 3\gamma)$, $C(3\gamma, +\infty)$, and
$C(-\infty, -3\gamma)$, respectively. Notice that $V$ is defined so as to
take the best advantage of the drift properties listed in Proposition 1.
For instance, when $V(n, s)$ is equal to $n$, then the Markov chain $M_t$
belongs to $C(-3\gamma, 3\gamma)$, which is included in
$C(-5\gamma, 5\gamma)$, where the backlog drift is negative provided that
either $n$ or $s$ is sufficiently large. Similar comments can be made about
the other two regions. Unfortunately, this does not enable us to conclude
that the drift of the Lyapunov function is negative in $U_M$ because
$M_{t+1}$ may well be in a different region than $M_t$. However, this
change of region becomes unlikely if we exclude a small zone around the
lines $x = 1 \pm 3\gamma$ where $V$ changes definition, and indeed the
second part of this proof consists of showing that the Lyapunov function
has a negative drift in the remainder of the state space.
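
The continuity claim for the Lyapunov function is easy to check at the two
boundary lines. The sketch below is illustrative; the function name and the
choice $\gamma = 0.1$ are assumptions, with $\gamma$ taken inside the
interval $(0, 1/5)$ required by Proposition 1.

```python
def lyapunov(n, s, gamma):
    """V(n, s) = max{ n, (1+3g)/(3g) (n - s), (1-3g)/(3g) (s - n) }."""
    hi = (1.0 + 3.0 * gamma) / (3.0 * gamma)
    lo = (1.0 - 3.0 * gamma) / (3.0 * gamma)
    return max(n, hi * (n - s), lo * (s - n))

gamma = 0.1
upper_edge = lyapunov(130.0, 100.0, gamma)   # n/s = 1 + 3*gamma
lower_edge = lyapunov(70.0, 100.0, gamma)    # n/s = 1 - 3*gamma
inside = lyapunov(100.0, 100.0, gamma)       # n/s = 1, V equals n
```

On the line $n/s = 1 + 3\gamma$ the first two terms coincide, and on
$n/s = 1 - 3\gamma$ the first and third coincide, so the maximum switches
branches without a jump.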

Proposition 2: There exist $M_0 > 0$ and $\delta_0 > 0$ such that for all
$N > M_0$ and for all
$(n, s) \in U_N \cap [C(-\infty, -4\gamma) \cup C(-2\gamma, 2\gamma) \cup C(4\gamma, \infty)]$,

$$E[V(M_{t+1}) - V(M_t) \mid M_t = (n, s)] \le -\delta_0.$$

Proof of Proposition 2: We consider separately likely and unlikely events

$$E[V(M_{t+1}) - V(M_t) \mid M_t = (n, s)]
 = E[(V(M_{t+1}) - V(M_t)) I(|A_t - \Sigma_t| \le J) \mid M_t = (n, s)]
 + E[(V(M_{t+1}) - V(M_t)) I(|A_t - \Sigma_t| > J) \mid M_t = (n, s)]. \tag{18}$$

We start by showing that the first term, which corresponds to likely
events, is negative when $J$ is large by using the properties of the
truncated drifts from Proposition 1 and a simple geometric result. The
lemma below, whose proof is in [9], gives a measure of how much a cone
$C(\lambda_0, \lambda_1)$ expands if each of its points is allowed to move
by some distance that cannot exceed $B$ in absolute value along each axis.

Lemma: Consider $\gamma > 0$, $B > 0$, and
$\gamma - 1 < \lambda_0 < \lambda_1 < +\infty$, and assume that
$|n - n'| \le B$, $|s - s'| \le B$, and
$Q \ge (B/\gamma)(1 + |\lambda_1|)(\lambda_1 + 2 + \gamma)$. Then:

1) $(n, s) \in C(\lambda_0, \infty) \cap U_Q \Rightarrow (n', s') \in C(\lambda_0 - \gamma, \infty) \cap U_{Q-B}$

2) $(n, s) \in C(-\infty, \lambda_1) \cap U_Q \Rightarrow (n', s') \in C(-\infty, \lambda_1 + \gamma) \cap U_{Q-B}$

3) $(n, s) \in C(\lambda_0, \lambda_1) \cap U_Q \Rightarrow (n', s') \in C(\lambda_0 - \gamma, \lambda_1 + \gamma) \cap U_{Q-B}$.

Let $B(J) = \max \{J, |\alpha| + \beta\}$, and define $Q(J)$ to be any real
such that
$Q(J) \ge \max \{B(J) + M, (B(J)/\gamma)(1 + 4\gamma)(2 + 3\gamma)\}$. We
have $|S_{t+1} - S_t| \le |\alpha| + \beta \le B(J)$, and if
$|A_t - \Sigma_t| \le J$, then $|X_{t+1} - X_t| \le J \le B(J)$. From the
lemma, $Q(J)$ is such that

$$M_t \in C(-2\gamma, 2\gamma) \cap U_{Q(J)} \Rightarrow M_{t+1} \in C(-3\gamma, 3\gamma) \cap U_M \tag{19}$$

$$M_t \in C(4\gamma, \infty) \cap U_{Q(J)} \Rightarrow M_{t+1} \in C(3\gamma, \infty) \cap U_M \tag{20}$$

$$M_t \in C(-\infty, -4\gamma) \cap U_{Q(J)} \Rightarrow M_{t+1} \in C(-\infty, -3\gamma) \cap U_M \tag{21}$$

where $M$ has been defined in Proposition 1. Assume, for instance, that
$M_t$ belongs to $C(-2\gamma, 2\gamma) \cap U_{Q(J)}$. From (19),
$M_{t+1} \in C(-3\gamma, 3\gamma) \cap U_M \subset C(-5\gamma, 5\gamma) \cap U_M$.
Hence, if $J \ge J_0$, we can apply Proposition 1 i):

$$E[(V(M_{t+1}) - V(M_t)) I(|A_t - \Sigma_t| \le J) \mid M_t = (n, s)] = c(n, s, J) \le -\delta + v(J).$$

If $M_t$ belongs to the other two regions,
$C(4\gamma, \infty) \cap U_{Q(J)}$ or $C(-\infty, -4\gamma) \cap U_{Q(J)}$,
a similar argument holds, using Proposition 1 iii) and ii), respectively,
along with (20) and (21). It follows that for all $J \ge J_0$ and for all
$(n, s) \in U_{Q(J)} \cap \{C(-\infty, -4\gamma) \cup C(-2\gamma, 2\gamma) \cup C(4\gamma, \infty)\}$

$$E[(V(M_{t+1}) - V(M_t)) I(|A_t - \Sigma_t| \le J) \mid M_t = (n, s)] \le -\delta_1 + v_1(J) \tag{22}$$

with $\delta_1 = \min \{1, (1 - 3\gamma)/3\gamma\}\,\delta$ and
$v_1(J) = v(J)(1 + 3\gamma)/3\gamma$.

To deal with the second term on the right-hand side of (18), we consider
the further decomposition

$$E[(V(M_{t+1}) - V(M_t)) I(|A_t - \Sigma_t| > J) \mid M_t = (n, s)]
 = E[(V(M_{t+1}) - V(M_t)) I(A_t > \Sigma_t + J) \mid M_t = (n, s)]
 + E[(V(M_{t+1}) - V(M_t)) I(\Sigma_t > A_t + J) \mid M_t = (n, s)]. \tag{23}$$

Let us denote by $T_1(n, s, J)$ and $T_2(n, s, J)$ the two terms on the
right-hand side of (23). The first term $T_1(n, s, J)$ corresponds to a
case where the variation in the backlog is bounded below, and can be shown
to vanish as $J$ increases by using the sole fact that the mean arrival
rate $\lambda$ is finite. Consider now $T_2(n, s, J)$. If $M_t = (n, s)$
belongs to a region such that $x = n/s > x_0$, then $x_0$ can be chosen
large enough so that if $M_{t+1}$ belongs to $C(-\infty, -3\gamma)$, then
the error in the backlog estimate which results from the large number of
successes just compensates the initial error $n - s > 0$. On the other
hand, when $M_t$ belongs to any region such that $x$ is bounded above, then
$E[\Sigma_t I(\Sigma_t > J) \mid M_t = (n, s)]$ goes to zero uniformly in
$(n, s)$ and $T_2(n, s, J)$ can be dealt with by using the following rather
crude bound for the variation of $V$:

$$|V(M_{t+1}) - V(M_t)| \le \max \left\{1, \frac{1 + 3\gamma}{3\gamma}, \frac{1 - 3\gamma}{3\gamma}\right\} (|\alpha| + \beta + |A_t - \Sigma_t|) \le R(1 + |A_t - \Sigma_t|) \tag{24}$$

where $R$ is some positive constant. It is shown in [9] that

$$E[(V(M_{t+1}) - V(M_t)) I(|A_t - \Sigma_t| > J) \mid M_t = (n, s)] \le v_2(J) + \epsilon_J(n, s) \tag{25}$$

where $\lim_{J \to \infty} v_2(J) = 0$, and $\epsilon_J(n, s)$ is a
nonnegative function that depends on $J$, and goes to zero as either $n$ or
$s$ goes to infinity.

By using (22), (25), and the decomposition (18), we get the desired result
that the drift of $V$ is negative in this part of the state space: fix an
integer $J_{\min}$ such that $J_{\min} \ge J_0$ and that for all
$J \ge J_{\min}$, $v_1(J) + v_2(J) \le \delta_1/3$. Then from (22) and
(25), we have for all
$(n, s) \in U_{Q(J_{\min})} \cap [C(-\infty, -4\gamma) \cup C(-2\gamma, 2\gamma) \cup C(4\gamma, \infty)]$,

$$E[V(M_{t+1}) - V(M_t) \mid M_t = (n, s)] \le -\tfrac{2}{3}\delta_1 + \epsilon_{J_{\min}}(n, s).$$

Then we can choose an $M_0 > Q(J_{\min})$ which is large enough so that
$\epsilon_{J_{\min}}(n, s) < \delta_1/3$ for all $(n, s)$ in $U_{M_0}$. □

This concludes the second part of the proof. Unfortunately, it is not
always true that the drift of $V$ is negative outside a finite subset of
the state space. For instance, we have proved that in the case of the usual
collision channel with Poisson new packet arrivals, there exist constants
$B_\alpha > 0$ and $M_\alpha$ such that for all
$(n, s) \in U_{M_\alpha}$ for which $x = 1 \pm 3\gamma$, and for all
$\alpha$ and $\beta$ verifying C1 and C2,
$E[V(M_{t+1}) - V(M_t) \mid M_t = (n, s)] > B_\alpha$. However,
discontinuities around the lines $x = 1 \pm 3\gamma$ cancel out when one
waits long enough, and in the last part of this proof we show that the
$J$-step drift of $V$, $E[V(M_{t+J}) - V(M_t) \mid M_t = (n, s)]$, is
negative for some integer $J$.

Proposition 3: There exist $J_f > 0$, $\mu > 0$, and $M_f > 0$ such that
for all $(n, s) \in U_{M_f}$

$$E[V(M_{t+J_f}) - V(M_t) \mid M_t = (n, s)] \le -\mu.$$

Proof of Proposition 3: One of the main problems in dealing with the
$J$-step drift of $V$ is to control the changes of regions between $M_t$
and $M_{t+J}$. To this end, we define the stopping time

$$\tau_J = \min \left\{ j \ge 0 : \sum_{k=0}^{j} |A_{t+k} - \Sigma_{t+k}| > J^2 \right\}.$$

If $\tau_J \ge J$, then for $1 \le k \le J$, $|X_{t+k} - X_t| \le J^2$ and
$|S_{t+k} - S_t| \le J(|\alpha| + \beta)$. Thus, if we define
$B'(J) = \max \{J(|\alpha| + \beta), J^2\}$, and $Q'(J)$ to be any integer
such that $Q'(J) > B'(J) + \max \{M_0, M\}$ and
$Q'(J) > (2B'(J)/\gamma)(1 + 9\gamma/2)(5\gamma + 2)$, then, still assuming
that $\tau_J \ge J$, we get from the lemma for $0 \le k \le J$

$$M_t \in C(-\infty, -4\gamma - \gamma/2) \cap U_{Q'(J)} \Rightarrow M_{t+k} \in C(-\infty, -4\gamma) \cap U_{M_0} \tag{26}$$

$$M_t \in C(-2\gamma + \gamma/2, 2\gamma - \gamma/2) \cap U_{Q'(J)} \Rightarrow M_{t+k} \in C(-2\gamma, 2\gamma) \cap U_{M_0} \tag{27}$$

$$M_t \in C(4\gamma + \gamma/2, \infty) \cap U_{Q'(J)} \Rightarrow M_{t+k} \in C(4\gamma, \infty) \cap U_{M_0} \tag{28}$$

$$M_t \in C(-4\gamma - \gamma/2, -2\gamma + \gamma/2) \cap U_{Q'(J)} \Rightarrow M_{t+k} \in C(-5\gamma, -\gamma) \cap U_M \tag{29}$$

$$M_t \in C(2\gamma - \gamma/2, 4\gamma + \gamma/2) \cap U_{Q'(J)} \Rightarrow M_{t+k} \in C(\gamma, 5\gamma) \cap U_M. \tag{30}$$

In other words, we have partitioned the plane into two zones: $Z_N$, the
union of the three cones on the left-hand sides of (26)-(28), and $Z_P$,
the union of the two cones on the left-hand sides of (29) and (30).

Then we have chosen $Q'(J)$ such that if $M_t$ belongs to $Z_N$, which is
slightly smaller than the region in which the drift of the Lyapunov
function is negative, and if $\tau_J \ge J$, then the Markov chain remains
in the region in which Proposition 2 applies up to time $t + J$ (see
(26)-(28) and Fig. 2). $Q'(J)$ is also such that if $M_t$ is in $Z_P$ and
if $\tau_J \ge J$, then up to time $t + J$ the chain stays in a region such
that two out of the three properties of Proposition 1 hold at each step
(see (29), (30), and Fig. 3).

We start by showing that the $J$-step drift of $V$ is negative at $(n, s)$
when $(n, s)$ belongs to $Z_N$. We decompose the $J$-step drift of $V$ as
follows:

$$E[V(M_{t+J}) - V(M_t) \mid M_t = (n, s)]
 = \sum_{k=0}^{J-1} E\big[E[V(M_{t+k+1}) - V(M_{t+k}) \mid M_{t+k}]\, I(\tau_J \ge J) \mid M_t = (n, s)\big]
 + \sum_{k=0}^{J-1} E\big[E[V(M_{t+k+1}) - V(M_{t+k}) \mid M_{t+k}]\, I(\tau_J < J) \mid M_t = (n, s)\big]. \tag{31}$$

Fig. 2. If $M_t \in Z_N \cap U_{Q'(J)}$ and if $\tau_J \ge J$, then
$M_{t+k}$ belongs to the region where the drift of $V$ is negative.

Fig. 3. If $M_t \in Z_P \cap U_{Q'(J)}$ and if $\tau_J \ge J$, then
$M_{t+k}$ belongs to a region where two properties of Proposition 1 hold.

Denote by $U_1(J, n, s)$ and $U_2(J, n, s)$ the two sums on the right-hand
side of (31). If $\tau_J \ge J$, then (26)-(28) hold, and therefore we can
apply Proposition 2:

$$U_1(J, n, s) \le -J\delta_0 P[\tau_J \ge J \mid M_t = (n, s)]. \tag{32}$$
Let us now show that $\tau_J < J$ is indeed an unlikely event, the
probability of which goes to zero as $1/J$ uniformly in $(n, s)$:

$$P[\tau_J < J \mid M_t = (n, s)]
 = P\left[\sum_{k=0}^{J-1} |A_{t+k} - \Sigma_{t+k}| > J^2 \,\Big|\, M_t = (n, s)\right]
 \le P\left[\sum_{k=0}^{J-1} A_{t+k} + \sum_{k=0}^{J-1} \Sigma_{t+k} > J^2 \,\Big|\, M_t = (n, s)\right].$$

From Markov's inequality we have

$$P[\tau_J < J \mid M_t = (n, s)] \le \frac{1}{J^2} \left( J\lambda + \sum_{k=0}^{J-1} E[\Sigma_{t+k} \mid M_t = (n, s)] \right).$$

Denoting by $B_c$ an upper bound on the sequence $t_n(p_n^*)$, it follows
from Section II that
$E[\Sigma_{t+k} \mid M_t = (n, s)] = E[E[\Sigma_{t+k} \mid M_{t+k}] \mid M_t = (n, s)] \le B_c$,
so we get

$$P[\tau_J < J \mid M_t = (n, s)] \le \frac{B_\tau}{J} \tag{33}$$

where $B_\tau$ is some positive constant. From (24), it is easy to check
that the drift of $V$ is bounded by some positive constant $B_V$, so that

$$U_2(J, n, s) \le J B_V P[\tau_J < J \mid M_t = (n, s)]. \tag{34}$$

Considering (31), (32), (33), and (34), we get

$$E[V(M_{t+J}) - V(M_t) \mid M_t = (n, s)] \le -\delta_0 J + (B_V + \delta_0) B_\tau.$$

Therefore, there exist constants $\mu_1 > 0$ and $J_1 > 0$ such that for
all $J \ge J_1$ and for all $(n, s) \in U_{Q'(J)} \cap Z_N$,

$$E[V(M_{t+J}) - V(M_t) \mid M_t = (n, s)] \le -J\mu_1. \tag{35}$$

We now proceed to show that the $J$-step drift of the Lyapunov function is
negative in the remaining part of the state space $Z_P$, consisting of the
two cones around $x = 1 \pm 3\gamma$. This is done in two steps. We first
show that the $J$-step drift of $V$ restricted to likely events
$\{\tau_J \ge J\}$ goes to $-\infty$ as $J$ increases, and then we prove
that the $J$-step drift of $V$ restricted to unlikely events
$\{\tau_J < J\}$ is bounded above independent of $J$.

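Assume, for instance, that
$(n, s) \in C(2\gamma - \gamma/2, 4\gamma + \gamma/2) \cap U_{Q'(J)}$. The
difficulty here is that $V$ can take two possible values, and therefore
Proposition 1 cannot be used directly. If $\tau_J \ge J$, then from (30)
$M_{t+k} \in C(\gamma, 5\gamma) \cap U_M$ for $0 \le k \le J$, so that
$V(M_{t+k}) = \max \{X_{t+k}, ((1 + 3\gamma)/3\gamma)(X_{t+k} - S_{t+k})\}$.
Therefore,

$$E[(V(M_{t+J}) - V(M_t)) I(\tau_J \ge J) \mid M_t = (n, s)]
 \le E\left[\max \left\{X_{t+J} - X_t,\ \frac{1 + 3\gamma}{3\gamma}(-\tilde{X}_{t+J} + \tilde{X}_t)\right\} I(\tau_J \ge J) \,\Big|\, M_t = (n, s)\right]$$

since $\max \{a, b\} - \max \{c, d\} \le \max \{a - c, b - d\}$. Then using
the fact that
$\max \{a, b\} \le \max \{0, a + f\} + \max \{0, b + f\} - f$ for
$f \ge 0$, we get

$$E[(V(M_{t+J}) - V(M_t)) I(\tau_J \ge J) \mid M_t = (n, s)]
 \le E\left[\max \left\{0,\ X_{t+J} - X_t + \frac{\delta_1 J}{2}\right\} I(\tau_J \ge J) \,\Big|\, M_t = (n, s)\right]
 + E\left[\max \left\{0,\ \frac{1 + 3\gamma}{3\gamma}(-\tilde{X}_{t+J} + \tilde{X}_t) + \frac{\delta_1 J}{2}\right\} I(\tau_J \ge J) \,\Big|\, M_t = (n, s)\right]
 - E\left[\frac{\delta_1 J}{2} I(\tau_J \ge J) \,\Big|\, M_t = (n, s)\right] \tag{36}$$

where $\delta_1 = \min \{1, (1 - 3\gamma)/3\gamma\}\,\delta$ has been
defined in (22). We show that the first two terms on the right-hand side of
(36) are bounded. Since from (33)
$\lim_{J \to \infty} -(\delta_1 J/2) P[\tau_J \ge J] = -\infty$, this will
be sufficient to prove that
$\lim_{J \to \infty} E[(V(M_{t+J}) - V(M_t)) I(\tau_J \ge J) \mid M_t = (n, s)] = -\infty$.
Define $W_k = X_{t+k} - X_t + k\delta_1/2$ and
$\mathcal{F}_k = F_{t+k}$, where $F_t$ is the sigma-field generated by
$\{A_s, s \le t - 1;\ X_s, s \le t\}$, representing the history of the
process $(M_t)_{t \ge 0}$ up to time $t$. To prove that the first term in
(36) is bounded, we show that there exists $\phi > 0$ such that
$(Y_k, \mathcal{F}_k)$ is a supermartingale, with
$Y_k = e^{\phi W_k} I(\tau_J \ge k)$. We need to show that
$E[Y_{k+1} \mid \mathcal{F}_k] \le Y_k$, that is,

$$E[e^{\phi(X_{t+k+1} - X_t + (k+1)\delta_1/2)} I(\tau_J \ge k + 1) \mid F_{t+k}] \le e^{\phi(X_{t+k} - X_t + k\delta_1/2)} I(\tau_J \ge k).$$

Since $I(\tau_J \ge k + 1) = I(\tau_J \ge k) I(\tau_J \ge k + 1)$, and
$I(\tau_J \ge k)$ is measurable with respect to $F_{t+k}$, it is enough to
show that

$$I(\tau_J \ge k) E[e^{\phi(X_{t+k+1} - X_{t+k} + \delta_1/2)} \mid F_{t+k}] \le I(\tau_J \ge k). \tag{37}$$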

Now if $\tau_J \ge k$, then from (30),
$M_{t+k} \in C(\gamma, 5\gamma) \cap U_M$. Lemma 2.2 in [11] states that if
$D$ is a random variable such that $|D|$ is stochastically dominated by an
exponential type random variable and if the expectation of $D$ is strictly
negative, $E[D] < -\epsilon < 0$, then there exist two constants
$\eta > 0$ and $\rho < 1$ such that $E[e^{\eta D}] \le \rho < 1$. Hence,
there exists $\phi > 0$ such that

for all $(n, s) \in C(-5\gamma, 5\gamma) \cap U_M$,

$$E[e^{\phi(X_{t+1} - X_t + \delta/2)} \mid M_t = (n, s)] < 1 \tag{38a}$$

for all $(n, s) \in C(-\infty, -\gamma) \cap U_M$,

$$E[e^{\phi(\tilde{X}_{t+1} - \tilde{X}_t + \delta/2)} \mid M_t = (n, s)] < 1 \tag{38b}$$

for all $(n, s) \in C(\gamma, \infty) \cap U_M$,

$$E[e^{\phi(-\tilde{X}_{t+1} + \tilde{X}_t + \delta/2)} \mid M_t = (n, s)] < 1. \tag{38c}$$

It follows from (37) and (38a) that $(Y_k, \mathcal{F}_k)$ is a
supermartingale. Therefore,

$$E[Y_J \mid \mathcal{F}_0] = E[e^{\phi W_J} I(\tau_J \ge J) \mid F_t] \le E[Y_0 \mid \mathcal{F}_0] = 1. \tag{39}$$

Finally, considering that $\max \{0, x\} \le (1/\phi) e^{\phi x}$, it
follows from (39) that the first term in (36) is bounded. Using (30) and
(38c), it can be shown with the same method that the second term in (36) is
also bounded. Thus, there exists a constant $B_T$ independent of $J$ such
that

$$E[(V(M_{t+J}) - V(M_t)) I(\tau_J \ge J) \mid M_t = (n, s)] \le B_T - \frac{\delta_1 J}{2} P[\tau_J \ge J].$$

The case
$(n, s) \in C(-4\gamma - \gamma/2, -2\gamma + \gamma/2) \cap U_{Q'(J)}$ can
be dealt with in a similar way, using (38a) and (38b). Therefore, we have
shown that there exist $\mu_2 > 0$ and $J_2 > 0$ such that for all
$J \ge J_2$ and for all $(n, s) \in U_{Q'(J)} \cap Z_P$

$$E[(V(M_{t+J}) - V(M_t)) I(\tau_J \ge J) \mid M_t = (n, s)] \le -J\mu_2. \tag{40}$$



Theorem  5:  Suopose  that: 

)  the  number  of  new  packet  arrivals  per  slot  has  finite  second 
moment  £I<4?]  <  +  00; 

ii)  there  exists  A  £  (0,  +  00)  such  that  t(A)  -  suprEo  f(*); 
ii)  CO':  there  exists  B  <  +»  such  that  for  all  n  2  1, 

k^nk  *  B. 

Fix  X  <  i;c  and  f  >  0  such  that  X  <  r(-4£).  Choose  a  <  0  and 
/3  >  0  such  that 

wl':  J3(e^«  — 1)=— 


t  is  shown  in  19]  that  there  exist  a  constant  B  >  0,  a  function 
'1  (J)  with  limy-®  t>3 (J),  and  a  nonnegative  function  vj(M) 
lenending  on  J  verifying  limw-«  vj(M)  -  0,  such  that  for  all  (n, 
-J  €  Uo-{j)+ui  Cl  ZP, 


Cl:  /3>m{(X)  =  sup 

x>Q,x*i 


\-t(Ax)~xe~AU'() 


>-'(AQ 

6 


_rp-A{e- £) 


x-xe 


V(\f,))I(Tj<J)] M,=(n,  s)} 


rhen  the  control  algorithm 


B+  v\{J)  +  v/(M\).  (41) 

Ve  are  now  ready  to  conclude  the  proof  of  Proposition  3.  From 
40)  and  (41),  we  have  for  all  ( n ,  s)  £  U0-u)+m,  fi  Zp, 
=  (/i.  s)]  S  B  -  Jn j  +  »,(J)  + 
•Wi).  Fix  an  integer//  2  max  {J\,  f)  such  that  for  all  /  2  Jf, 

:  -  Jai  +  W)(J)  <  -pi .  Then  for  all  (n,  s)  £  Uq-{]. )  +  W|  O 
■»,  we  have  E[V(M,rJ,)  -  V(M,)\M,  =  (n,  5)]  £  -Hi  + 
v,tAf|).  On  the  other  hand,  we  also  have  from  (44),  for  all  (a,  s) 

-  f)*M\  O  Zp 

r V(M,)\ H  =  (a,  *)]£  -Ml//. 

4ow  fix  M,  large  enough  so  that  vj(M,)  £  mj/2.  Then  define  Mr 
<V(J,)  +  Mi,  and  p  =  min  {mj/2,  Jjfi ■ }.  O 

Ve  can  now  conclude  that  (A/,),* 0  is  geometrically  ergodic  for 
x  <  nc  by  invoking  the  following  result. 

heorem  (Hajek  [11]):  Let  \W,}  be  a  sequence  of  random 
'anables  adapted  to  an  increasing  family  of  a-fields  {F,}. 
:uopose  that  W0  is  deterministic,  that  {W,,  F,}  is  exponential 
vpe,  and  that  for  some  <>0anda>0we  have  £[(  IV, + 1  -  W, 
-)  1(W,  >  a)  |F,]  £  0  for  all  t  2  0.  Then  for  each  value  of 
Vn  the  stopping  time  r  =  min  {/  2  0;  W,  <,  <7}  is  exponential 
vpe. 

Tenne  W,  =  V(Mt],)  and  a  =  Afytnax  {1,(1  +  3-y)/3*y»  (1  - 
-v)/3-y} .  If  V(M,)  >'  a,  then  M,  £  UMf.  From  (24)  and  CO 
V(M,),  F,)  is  exponential  type  since  A,  is.  From  Proposition  3, 
ve  can  aopiy  Hajek’s  result  to  our  system  to  conclude  that  r  =* 
run  it  2  0,  V(M,Jf)  £  0}  is  exponential  type  for  any  initial 
tate.  since  V(M,)  £  a  implies  that  X,  £  a  and  S,  £  a/(l  - 
y),  it  follows  that  r'  =  min  {/  2  0,  Xu,  £  a,  and  £  a/(  1 
W)}  is  also  exponential  type  for  any  initial  state,  as  well  as  t* 
min  if  2  0,  X,  £  a,  and  S,  £  a/ (l  -  3y)}.  Hence,  it  follows 
rom  114]  that  (X„  S ,)  is  geometrically  ergodic,  concluding  the 
•root  of  Theorem  4.  □ 


A 


S(+l=max  {A,  S,  +  er/(Z,  =  O)  +  /3/(Z,  =  0)} 

is  stable. 

Proof  of  Theorem  5:  Let  us  state  first  Mikhailov’s  Theorem 
(cf.  [35]  for  an  exposition  of  this  result  and  its  application  in  the 
decentralized  control  of  the  conventional  collision  channel). 

Theorem (Mikhailov [19]): Let $M_t = (X_t, S_t)$ be a homogeneous
Markov process on $R^+ \times R^+$ with drifts

$$(c(n, s), e(n, s)) = E[M_{t+1} - M_t \mid M_t = (n, s)].$$

Suppose that:

i) there exists $B < +\infty$ such that for all $(n, s) \in R^+ \times R^+$,
$E[\|M_{t+1} - M_t\|^2 \mid M_t = (n, s)] \le B$;

ii) for all $\psi \in (0, +\infty)$, the drifts $(c(n, n/\psi), e(n, n/\psi))$
converge uniformly in $\psi$ as $n$ goes to infinity to $(c(\psi), e(\psi))$;

iii) the limit drifts $(c(\psi), e(\psi))$ are differentiable on $[0, +\infty)$,
with $(c(0), e(0)) = \lim_{s \to \infty} (c(0, s), e(0, s))$;

iv) there exists $\epsilon > 0$ such that if $c(\psi_0) = \psi_0 e(\psi_0)$, then
$c(\psi_0) < -\epsilon$.

Then $M_t$ is stable.

Since both the new packet arrivals and the rows of the reception
matrix have finite variance, it is easy to check that condition i) in
Mikhailov's Theorem holds:

$$E[\|M_{t+1} - M_t\|^2 \mid M_t = (n, s)] = E[(X_{t+1} - X_t)^2 + (S_{t+1} - S_t)^2 \mid M_t = (n, s)].$$

Now $E[(S_{t+1} - S_t)^2 \mid M_t = (n, s)] \le \alpha^2 + \beta^2$, and from (2)

$$E[(X_{t+1} - X_t)^2 \mid M_t = (n, s)] \le E[A_t^2] + E[\Sigma_t^2 \mid M_t = (n, s)].$$

From C0' the variance of the number of successes is also bounded


IV. Stability Proof Via Mikhailov's Theorem

Mikhailov [19, Theorem 3] has recently found a powerful
sufficient condition to guarantee the stability of a Markov process
taking values on $R^+ \times R^+$. This result can be used to weaken the
sufficient conditions we imposed in Section III and obtain a much
simpler proof of stability. However, the form of stability
used by Mikhailov is weaker than the geometric ergodicity used in
Section III.

Let $M_t$ be a discrete-time Markov process taking values in $Y \subseteq R^n$,
$U(r) = \{x \in Y : \|x\| \le r\}$, and $\tau_x(S) = \min\{t \ge 0 : M_t \in S \mid M_0 = x\}$,
i.e., $\tau_x(S)$ is the time it takes to reach the set $S$ from
$x$. Then we say that the process $M_t$ is stable if there exist constants
$r$, $c_1$ and $c_2$ such that $E[\tau_x(U(r))] \le c_1 \|x\| + c_2$ for all $x \in Y$.
Using this definition of stability we show the following result,
which is analogous to Theorem 4.


$$E[\Sigma_t^2 \mid M_t = (n, s)] \le B.$$


It  follows  directly  from  (16)  and  (17)  that  the  limit  drifts  are 
given  by 


c(4>)**\-t(A4r) 
e(^)  =  0  +  (ar-£)*''4*, 

respectively, for $\psi \in [0, +\infty)$. Uniform convergence to the limit
drifts follows immediately from the results given for the perfect
state information case (Property 4). Also it is clear that $t(x)$ is
differentiable (see (6), where $0 \le C_n \le n$). Therefore, properties
ii) and iii) in Mikhailov's Theorem are satisfied.

In order to check property iv) note that if $\psi_0 = \xi$, then it
follows from C1' that

$$c(\psi_0) = \psi_0 e(\psi_0).$$


1) One-Dimensional Kaplan's Condition: Consider the
model of Section II with a control scheme $p_n = F(X_n)$, and the
Lyapunov function $V(x) = x$. To check Kaplan's condition, it is
enough from [27] to show that the downward part of the drift,
$D(i) = -\sum_{k=1}^{i} k P_{i,i-k}$, is bounded below. For $i \ge 1$ and $1 \le k \le i$
we have


But, at that point, $c(\psi_0) < 0$ because of the choice of $\xi$. There is
no other root of the equation $c(\psi) = \psi e(\psi)$, and, therefore,
property iv) follows. To see this, note that because of C1', $c(\psi) = \psi e(\psi)$
for $\psi \ne \xi$ is equivalent to


0  = 


±  j 

l-gAlt-f) 


which is impossible if $\psi \ne \xi$ because of C2'. □

It can be shown [9] that $m_\xi(\lambda)$ is finite for all nonnegative $\lambda$ and
$\xi$, and therefore the set of control laws defined by C1' and C2' is
nonempty. Actually, the set of control laws in Theorem 4 is a
subset of those in Theorem 5 because in Theorem 5 we can choose
$\xi = 1$, in which case C2 is equivalent to C1' and C1 is more
restrictive than C2' because $\lambda \ge m_\xi(\lambda)$ (9). □


V.  Conclusion 


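$$P_{i,i-k} = \sum_{n \ge 0} \lambda_n \sum_{j=k+n}^{i} \binom{i}{j} F(i)^j (1 - F(i))^{i-j}\, \varepsilon_{j,k+n}.$$

After a change of variable, it follows that

$$D(i) = -\sum_{j=1}^{i} \binom{i}{j} F(i)^j (1 - F(i))^{i-j} \sum_{n \ge 0} \lambda_n \sum_{k=n+1}^{j} (k - n)\, \varepsilon_{j,k}. \qquad \text{(A-1)}$$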

If $(C_n)_{n \ge 1}$ is bounded, then Kaplan's condition holds independent
of the retransmission policy. Denoting by $B_C$ an upper bound for
$(C_n)_{n \ge 1}$, (A-1) becomes

$$D(i) \ge -\sum_{j=1}^{i} \binom{i}{j} F(i)^j (1 - F(i))^{i-j} \sum_{k=1}^{j} k\, \varepsilon_{j,k}$$


In this paper we have investigated the properties of decentralized
control algorithms for a random access channel with
multipacket reception capability. By using the working hypothesis
that the users are aware of the value of the backlog, we have
determined the best throughput achievable by any such protocol,
as well as a simple way to achieve it. The optimum throughput has
been shown to be given by the maximum average number of
successes per slot when the number of transmissions per slot is
Poisson distributed. In the imperfect state information case, we
have shown that the same throughput achieved in the perfect state
information case can be achieved by using, in lieu of the true
backlog, an estimate of the backlog computed at each station using
binary feedback, and we have used this estimate to derive a
control scheme which is optimal in the sense that it achieves the
optimal throughput determined earlier. This is true provided the
reception matrix verifies condition C0, which puts some restrictions
on the number of successes per slot. By using Mikhailov's
result, C0 can be replaced by the weaker condition C0'. In this
case, however, geometric ergodicity was not ensured. Note that
the empty/nonempty feedback used in Sections III and IV may be
less than the available feedback in many practical situations, but
no further information is needed: a ternary feedback would not
shorten the proof or achieve better throughput.
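The throughput characterization in the conclusion can be illustrated numerically. The following is a minimal sketch, assuming the reception matrix is summarized by a list `C` whose entry `C[n]` is the expected number of successes given `n` simultaneous transmissions; the function names, the truncation of the Poisson sum, and the grid search are illustrative choices, not part of the paper.

```python
import math

def poisson_throughput(x, C):
    """Expected successes per slot when the number of transmissions is Poisson(x).

    C[n] is the expected number of successes given n simultaneous
    transmissions (C[0] = 0); the sum is truncated at len(C) - 1.
    """
    term = math.exp(-x)          # Poisson probability of n = 0
    total = 0.0
    for n, c_n in enumerate(C):
        total += term * c_n
        term *= x / (n + 1)      # advance to the Poisson probability of n + 1
    return total

def best_throughput(C, x_max=10.0, steps=10000):
    """Grid search for the maximizing x (grid and range are arbitrary choices)."""
    grid = [x_max * i / steps for i in range(1, steps + 1)]
    x_star = max(grid, key=lambda x: poisson_throughput(x, C))
    return x_star, poisson_throughput(x_star, C)

# Conventional collision channel: a success only when exactly one user transmits.
collision = [0.0, 1.0] + [0.0] * 50
x_star, t_star = best_throughput(collision)
```

For the conventional collision channel the expected number of successes reduces to x·exp(-x), so the search recovers the familiar maximum of 1/e at x = 1; a multipacket reception matrix would simply supply a different list `C`.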

Finally, let us mention that one can easily modify the proof of
Theorem 4 to show that a similar result holds with the IFT access
rule. More precisely, under hypotheses paralleling those of
Theorem 4, one can build a control scheme based on binary
empty/nonempty feedback such that the Markov vector $(X_t, S_t)$ is
geometrically ergodic for $\lambda < \sup_{x \ge 0} T(x)$. Using Theorem 3, it
can be seen that the maximum stable throughput is the same for
both access rules when the new packet arrivals are Poisson
distributed.


$$= -\sum_{j=1}^{i} \binom{i}{j} F(i)^j (1 - F(i))^{i-j}\, C_j \ge -B_C. \qquad \text{(A-2)}$$

2) Two-Dimensional Kaplan's Condition: Consider now the
multipacket channel with a general control algorithm (1). Then
$(X_n, S_n)$ is the Markov chain of interest, and the relevant Lyapunov
function is $V(n, s) = n$. We prove again that Kaplan's condition
holds provided that $(C_n)_{n \ge 1}$ is bounded. From [27], it is enough
also in this case to show that the downward part $T(x)$ of the
generalized drift is bounded below, with
$T(x) = \sum_{y : V(y) < V(x)} P_{xy} (V(y) - V(x))$. Given a state $x = (i, s)$, we have


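$$T(x) = -\sum_{r=1}^{i} \sum_{k} r\, P[X_{n+1} = i - r,\ S_{n+1} = k \mid X_n = i,\ S_n = s] = -\sum_{r=1}^{i} r\, P[X_{n+1} = i - r \mid X_n = i,\ S_n = s]$$

which is, in the same way as before

$$T(x) = -\sum_{r=1}^{i} r \sum_{n \ge 0} \lambda_n \sum_{j=r+n}^{i} \binom{i}{j} F(s)^j (1 - F(s))^{i-j}\, \varepsilon_{j,r+n}.$$

This expression is similar to (A-1), and the end of the proof is the
same as in (A-2).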


Appendix 

Kaplan’s  Condition 

Consider a Markov chain with denumerable state-space $D$, and
one-step transition probability matrix $(P_{xy})_{x,y \in D}$. Let $V(x)$ be a
Lyapunov function on $D$. Then the generalized Kaplan's condition
holds if there exists a positive constant $B$ such that for all $z \in (0, 1)$
and all $x \in D$

$$z^{V(x)} - \sum_{y \in D} \cdots$$

Abstract

This work studies optical Code Division Multiple Access
(CDMA) systems, and presents the exact error expression for
the noncoherent, single-user matched-filter receiver based on the
electron count in a symbol period. This analysis is valid for arbitrary
photomultipliers, adheres fully to the semi-classical model
of light, and does not depend on approximations for large user
groups or strong received optical fields.

The general error rate expression is specialized to the case
of unity gain photodetectors and prime sequences, and the exact
minimum probability of error and optimal threshold are compared
to those obtained with simplifying assumptions on user
transmission coordination or multiple-access-interference (MAI)
distribution. We find that the approximation of chip synchronism
yields a weak upper bound on the true error rate, and
we demonstrate that the approximations of perfect optical-to-electrical
conversion and Gaussian MAI yield an optimal hypothesis
test whose error rate overestimates the true minimum error
rate and underestimates the optimal threshold for moderate and
large received optical energies.


Optical  CDMA  Model 


$$nT \le t - \tau_j < (n + 1)T$$

where $s$ is proportional to the optical energy per bit of the
transmitting laser, $\nu$ denotes the optical carrier frequency
(assumed to be identical for all users), and $\theta_j$ is the phase
offset of the $j$th laser from the first laser. In this expression
$W_j(t)$ is a standard Brownian motion, and $\alpha_j$ is related to
the $j$th transmitting laser linewidth, $B_j$, by $\alpha_j = \sqrt{2\pi B_j}$.
The relative delays $\{\tau_j\}$ are defined on $(0, T)$ with reference
to the receiver of the first user. With dispersion-free
transmission (1) also represents the complex scalar field at
the first receiver due to user $j$.

We shall assume that the symbol rate of each user is
the same, the optical fields of the $K$ users add in a noncoherent
fashion, and that each single-user receiver acquires
the timing of its transmitter's symbol epochs. As there
is no cooperation between the users, it is appropriate to
model the remaining relative delays as independent,
identically distributed random variables that are uniformly
distributed on the interval $(0, T]$. It follows that the
intensity of the optical field at the receiver of the first user
is


The digital modulation format studied in this paper
is optical Direct Sequence Spread Spectrum, i.e., during
each symbol interval of duration $T$, the $j$th transmitting
laser is amplitude-modulated by the product of the data,
which takes on values in $\{0, 1\}$, and an assigned signature
sequence of relatively short rectangular pulses. This
scheme divides the symbol interval into $N$ equal length
subintervals, called chips, on which the signature sequence
is constant and takes on values in $\{0, 1\}$. Further, we define
$P_j = P$ as the number of non-zero chips in each signature
sequence, $b_{j,n}$ as the transmitted symbol of the $j$th user in
the interval $[nT, (n + 1)T)$, and $c_j(t)$ as a periodic replication
of the signature sequence of the $j$th user such that
$\{c_j(t),\ t \in [nT, (n + 1)T)\}$ is the $j$th signature sequence for
any fixed integer $n$. Then the transmitted complex scalar
field from the $j$th laser may be expressed as


IKOI'  =  y-  ]C6r.-ic;(‘  -r;)J''(°>r;)  +  A;,oc;(t  -  Ti)PtlTi<T) 
1  ;= l 

where $p_t(a, b)$ is a rectangular pulse of unit height with
support $(a, b)$. Due to the modulation shown in (1), the
resulting photon point process depends on the data $b_{1,0}$
only on the set $\{t \mid c_1(t) = 1,\ 0 \le t < T\}$. A commonly used
receiver for this channel is the noncoherent matched filter,
which sums the photon counts in each of the nonzero chip
subintervals of the user of interest. Given that the function
$c_1(t)$ takes values on $\{0, 1\}$, the correlation operation would
be easily achieved at extremely low chip rates by an electro-optic
modulator, which would allow received light to pass
only when $c_1(t) = 1$. A more effective device to achieve
the matched-filtering operation at higher chip rates is the
fiber optic tap delay line, which uses the finite
propagation velocity of light to achieve the proper relative
delay of two optical signals by passing them through
fibers of different lengths. The matched-filter direct-detection
receiver has been studied in several experiments
[1,2] and will be the CDMA receiver analyzed in this work.


cross-correlations

$$R_{j,1}(\tau) \triangleq \frac{N}{T} \int_0^{\tau} c_j(t - \tau)\, c_1(t)\, dt$$

$$\hat{R}_{j,1}(\tau) \triangleq \frac{N}{T} \int_{\tau}^{T} c_j(t - \tau)\, c_1(t)\, dt$$

that represent the contributions to the conditional mean $\Lambda$
due to the $j$th signature sequence for the duration of $b_{j,-1}$
and $b_{j,0}$, respectively. Also, $d$ represents the portion of the
primary electron count mean due to thermoelectrons. In
the remainder of this work we set the quantum efficiency of
the photodetector to unity, as this affects the distribution
of $\mathcal{N}$ only through an attenuation of intensity. Further, we
set $x = \alpha s b_{1,0} = 0$ under hypothesis $H_0$ and $x = s$ under
hypothesis $H_1$.


Figure 1. Optical Noncoherent Matched-Filter CDMA Receiver

As shown in Figure 1, the total received optical signal
$r(t)$ is coupled to a $1 \times P$ beam splitter. Each of the outputs
of the splitter is an identical copy of the input signal, only
attenuated in intensity by $P$. These signals are input to the
tap delay line. The function of the $i$th tap is to delay the
received field so that the optical signal in the $i$th non-zero
chip of the first signature sequence overlaps in time with
the last ($P$th) non-zero chip of the same bit interval in the
undelayed signal. Thus, the first tap requires more fiber
cable than the second. The tapped signals are noncoherently
recombined, and the output optical signal is incident
on the photodetector. To decide on the value of $b_{1,0}$, we
use the secondary electron count during the last non-zero
chip interval of the first signature sequence. For the remainder
of this work we denote this secondary electron
count as $\mathcal{N}$. We shall employ a common photomultiplier
model, in which the intensity of primary electrons is given
by $\alpha |r(t)|^2 + \tilde{d}$, where $\alpha$ is proportional to the quantum
efficiency of the photodetector, and $\tilde{d}$ denotes the rate of
primary electrons due to an independent dark current. The
$n$th primary electron yields a random number of secondary
(output) electrons $g_n$, and the collection $\{g_n\}$ is assumed to
be mutually independent, identically distributed, and independent
of the photon or primary electron point process
[3]. The common probability generating function of $\{g_n\}$
is denoted as $G(z) = \sum_{k=0}^{\infty} p_k z^k$. In this case, $\mathcal{N}$ is conditionally
compound Poisson given the integrated intensity,
which we define as $\Lambda$, and the distribution of $\mathcal{N}$ depends
only on $G(z)$ and the integrated intensity $\Lambda$, given by


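$$\Lambda = \alpha s\, b_{1,0} + d + \frac{\alpha s}{P} \sum_{j=2}^{K} \left[ b_{j,-1} R_{j,1}(\tau_j) + b_{j,0} \hat{R}_{j,1}(\tau_j) \right] \qquad (2)$$


where $R_{j,1}(\tau)$ and $\hat{R}_{j,1}(\tau)$ are the normalized (partial)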


derivation  of 


In this section we obtain the general expression for the
PMF of the secondary electron count $\mathcal{N}$ at the integrator
output for an arbitrary photomultiplier and for synchronous
or asynchronous transmission. We will use this
result in a later section to compare the error rates under
various simplifying approximations to the exact error rate.
Also, the form of the general expression will be used in the
next section to develop arbitrarily tight, computationally
efficient bounds on the cumulative distribution function of $\mathcal{N}$.


In the following, we define $M$ as the upper bound on
the set of total cross-correlations $R_{j,1} + \hat{R}_{j,1}$, and as the
signature sequences are from $\{0, 1\}^N$, these bounds hold
for the partial cross-correlations as well. Since the relative
delays are uniformly distributed and the chip waveform
is rectangular, it is straightforward to show that each
cross-correlation is a mixed random variable whose measures
have point masses on the integers $\{0, 1, \ldots, M\}$ and
continuous portions that are constant between these integers.
We shall employ the following notation


$$P[R_{j,1} = i] = d_j(i), \qquad i \in \{0, 1, \ldots, M\}$$

$$P[R_{j,1} \in [v, v + dv)] = c_j(i)\, dv, \qquad [v, v + dv) \subset (i, i + 1)$$


and we denote the distribution of $R_{j,1}$ as $\{d_j(0), d_j(1), \ldots,
d_j(M), c_j(0), \ldots, c_j(M - 1)\}$. Thus the marginal distribution
of each cross-correlation is completely specified by
$2M$ parameters. Further, the superscript-$T$ notation will
be used to distinguish the distribution of the total cross-correlation
$R_{j,1} + \hat{R}_{j,1}$ from that of $R_{j,1}$, and the hat notation
will be used for the distribution of $\hat{R}_{j,1}$.


Our approach to finding the PMF of $\mathcal{N}$ is the following:
we will derive the z-transform of $\mathcal{N}$ from its conditional
compound Poisson nature, and then show that this
z-transform has a particularly straightforward and explicit
Maclaurin series expansion. The PMF is the collection of
coefficients of this series, and may be explicitly represented.

By conditioning on $(x, \{R_{j,1}, \hat{R}_{j,1}\},\ j = 2, \ldots, K)$, the
count $\mathcal{N}$ has a compound Poisson distribution, whose
z-transform is given by


There are $2M + 2$ terms inside of the braces. Letting
$n_q$ index the number of occurrences of $D(q)$, and $m_q$ the
number of occurrences of $[C(q - 1) - C(q)]$ in a multinomial
expansion, we rewrite (5) as


$$\cdots \times \frac{\exp\left[(G(z) - 1)\left(x + d + \frac{s}{P} \sum_{q=0}^{M} q\,(n_q + m_q)\right)\right]}{(1 - G(z))^{\sum_{q} m_q}} \qquad (6)$$


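$$E\left[z^{\mathcal{N}} \mid x,\ R_{j,1},\ \hat{R}_{j,1},\ j = 2, \ldots, K\right] = e^{(x + d)(G(z) - 1)} \times \prod_{j=2}^{K} e^{\frac{s}{P}\left(b_{j,-1} R_{j,1} + b_{j,0} \hat{R}_{j,1}\right)(G(z) - 1)} \qquad (3)$$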


where the outer summation is over all the indices such
that $\sum_{q=0}^{M} (m_q + n_q) = K - 1$. We find the PMF of $\mathcal{N}$ in
the following way. Suppose that we knew explicitly the
coefficients of the following power series


Due to the mutual independence of the pairs $\{R_{j,1}, \hat{R}_{j,1}\}$
we need to determine only the expectation of each factor
in (3), as the factor depends only on the random
mixture $b_{j,-1} R_{j,1} + b_{j,0} \hat{R}_{j,1}$. It is clear that the random
mixture has the same kind of distribution as $R_{j,1}$, and
we denote this mixed distribution by $(D_j(0), \ldots,
D_j(M), C_j(0), \ldots, C_j(M - 1))$. With this notation, the
closed form expression of the power series of interest is


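$$E\left[z^{\mathcal{N}} \mid x\right] = e^{(G(z) - 1)(x + d)} \times \prod_{j=2}^{K} \left\{ \sum_{q=0}^{M} D_j(q) \exp\left(q \tfrac{s}{P} (G(z) - 1)\right) - \frac{P}{s}\, \frac{e^{(G(z) - 1) s/P} - 1}{1 - G(z)} \sum_{q=0}^{M-1} C_j(q) \exp\left(q\, (G(z) - 1) \tfrac{s}{P}\right) \right\} \qquad (4)$$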


We are interested in finding $P[\mathcal{N} = n \mid x]$, which is
the coefficient of $z^n$ in the power series of (4) about the
origin. This power series is straightforward but unnecessarily
general for most signature sequence sets of interest.
For example, the number of parameters in the power series
is reduced by a factor of $K - 1$ by assuming that the
marginals of $R_{j,1}$, $\hat{R}_{j,1}$ and $R_{j,1} + \hat{R}_{j,1}$ are independent of $j$,
i.e., the contribution of user $j$ to the MAI is statistically
indistinguishable from the other interferers. We have verified
that this is an excellent approximation when the signature
sequences come from the prime codes, and will drop the
subscript from the distribution of the random mixtures in
the sequel. Also, the power series of this expression is concisely
written if we define $C(-1) = C(M) = 0$. With these
simplifications, (4) becomes

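$$E\left[z^{\mathcal{N}} \mid x\right] = e^{(G(z) - 1)(x + d)} \times \left\{ \sum_{q=0}^{M} D(q)\, e^{q \frac{s}{P}(G(z) - 1)} - \frac{P}{s}\, \frac{1}{1 - G(z)} \sum_{q=0}^{M} \left[C(q - 1) - C(q)\right] e^{q \frac{s}{P}(G(z) - 1)} \right\}^{K-1} \qquad (5)$$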

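$$\frac{e^{\alpha (G(z) - 1)}}{(1 - G(z))^{\beta}} = \sum_{n=0}^{\infty} \mathrm{Res}(n, \alpha, \beta)\, z^n \qquad (7)$$

Using (7) in (6) we express the PMF for $\mathcal{N}$ as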


1 1(7=0  n=Q 


M 


~  {C('i  -  I)  -  C('/)} 


Af 


Tic*  n,{xa'-(l+l^?(n, +  (8) 


1=0 


<7=0 


All that remains to be determined is an explicit expression
for the coefficients Res of the power series in (7). In
the following we show that Res may be calculated by a
linear recursion on the integers $n$ and $\beta$.

The recursion for Res is most easily seen by substituting
the identity

$$\frac{e^{\alpha (G(z) - 1)}}{(1 - G(z))^{\beta + 1}} = \frac{G(z)\, e^{\alpha (G(z) - 1)}}{(1 - G(z))^{\beta + 1}} + \frac{e^{\alpha (G(z) - 1)}}{(1 - G(z))^{\beta}}$$

where $\beta \in \{0, 1, 2, \ldots\}$, into the definition for Res (7).
This yields

$$(1 - p_0)\, \mathrm{Res}(n + 1, \alpha, \beta + 1) = \sum_{l=1}^{n+1} p_l\, \mathrm{Res}(n + 1 - l, \alpha, \beta + 1) + \mathrm{Res}(n + 1, \alpha, \beta), \qquad n, \beta \in \{0, 1, 2, \ldots\} \qquad (9)$$

where $G(z) = \sum_{l=0}^{\infty} p_l z^l$. For most photomultiplier models
$p_0 = 0$, which we will assume in the sequel. The initial
conditions of this recursion are also easily extracted from
the definition of Res,


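$$\mathrm{Res}(0, \alpha, \beta) = e^{-\alpha}, \qquad \beta \in \{0, 1, 2, \ldots\} \qquad (10)$$

$$\mathrm{Res}(n, \alpha, 0) = \sum_{k=0}^{n} \frac{\alpha^k}{k!}\, e^{-\alpha}\, P\left[\sum_{l=1}^{k} g_l = n\right], \qquad n \in \{0, 1, \ldots\}.$$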


The linear recursion for Res on $n$ and $\beta$ permits fast, efficient
computation for any arguments $n, \beta \ge 1$. Note
that the second initial condition for this recursion depends
on $P[\sum_{l=1}^{k} g_l = n]$, which must be known for
$n, k \in \{0, 1, 2, \ldots\}$. These probabilities require iterated
convolutions of the PMF of the random gain $g_l$, may be
precomputed and stored for small $n$ and $k$, and may be
accurately approximated online for large $n, k$. We are naturally
interested in special cases where $P[\sum_{l=1}^{k} g_l = n]$
has an explicit form; it is easy to show that this is the
case for random gains that are shifted Poisson-distributed,
as well as for the unity gain case.
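The recursion (9) with initial condition (10) can be sketched in a few lines. This is a minimal illustration under stated assumptions: $p_0 = 0$ as in the text, and the $\beta = 0$ row (the compound Poisson PMF) is filled in with Panjer's recursion, which is a substitute chosen here for brevity rather than the iterated-convolution formula given above. The function name `res_table` and its argument layout are hypothetical.

```python
import math

def res_table(alpha, gain_pmf, n_max, beta_max):
    """Coefficients Res(n, alpha, beta) of exp(alpha*(G(z)-1)) / (1-G(z))**beta.

    gain_pmf[l] = P[g = l]; gain_pmf[0] is assumed to be 0.
    Returns res[beta][n] for 0 <= beta <= beta_max and 0 <= n <= n_max.
    """
    # Pad the gain PMF with zeros so that p[l] is defined for every l <= n_max.
    p = list(gain_pmf) + [0.0] * max(0, n_max + 1 - len(gain_pmf))

    # beta = 0 row: compound Poisson PMF with mean alpha (Panjer's recursion).
    row = [math.exp(-alpha)]
    for n in range(1, n_max + 1):
        row.append(alpha / n * sum(l * p[l] * row[n - l] for l in range(1, n + 1)))
    res = [row]

    for beta in range(1, beta_max + 1):
        prev = res[-1]
        cur = [math.exp(-alpha)]                       # initial condition (10)
        for n in range(1, n_max + 1):
            # recursion (9) with p_0 = 0
            cur.append(sum(p[l] * cur[n - l] for l in range(1, n + 1)) + prev[n])
        res.append(cur)
    return res

# Unity-gain example: G(z) = z, so every primary electron yields one output electron.
table = res_table(alpha=2.0, gain_pmf=[0.0, 1.0], n_max=10, beta_max=2)
```

In the unity-gain case the $\beta = 0$ row is the Poisson PMF and the $\beta = 1$ row is its running sum, i.e. the Poisson CDF, which gives a quick sanity check on the recursion.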

Arbitrarily Tight Bounds on $P[\mathcal{N} \le n \mid x]$

Computationally efficient bounds must reduce the complexity
of (8) in both the multinomial summation and the
computation of Res, while controlling the loss of accuracy
by a parameter of our selection. In this section we show
that by quantizing the random mixtures, we achieve all
three objectives.

The complexity of the PMF is due to the smoothing
over the joint distribution of the random mixtures; we
originally conditioned on these random variables to take
advantage of the conditional compound Poisson nature of
$\mathcal{N}$. We could have also conditioned according to the conditional
mean, $\Lambda$, for which $\mathcal{N}$ is also compound Poisson.
However, the exact distribution of the conditional mean $\Lambda$
is not easily obtained, as it is formed by the convolution
of $K - 1$ mixed distributions. It is obvious that if the convolved
distributions were discrete, say, with $QM + 1$ points,
then the exact distribution of $\Lambda$ would be straightforward
to compute. More importantly, the distribution of $\Lambda$ would
take on $(K - 1)QM + 1$ points, rather than a number that
is exponential in the number of interferers.

But how do we obtain bounds on $P[\mathcal{N} \le n \mid x]$ that use
a discrete distribution on $\Lambda$, and are arbitrarily tight? Suppose
we quantize the random mixtures $\{b_{j,-1} R_{j,1} + b_{j,0} \hat{R}_{j,1}\}$
with a $1/Q$ quantization step size, $Q \in \{1, 2, \ldots\}$, and round
up or round down to form bounds on the random mixtures.
That is, we form $\Lambda_l, \Lambda_u$ given by

$$\Lambda_l = x + d + \frac{s}{PQ} \sum_{j=2}^{K} \left[ b_{j,-1} \lfloor Q R_{j,1} \rfloor + b_{j,0} \lfloor Q \hat{R}_{j,1} \rfloor \right]$$


A subtle point is raised by considering the form of $\mathcal{N}$,

$$\mathcal{N}(\Lambda) = \sum_{p=1}^{\Pi(\Lambda)} g_p$$

where $\Pi(\Lambda)$ is the conditionally Poisson number of primary
electrons with conditional mean $\Lambda$. Since the $g_p$ are
non-negative, we have that $\mathcal{N}$ is an increasing function
of the primary electron count, $\Pi$. It is not clear that
(a.s.) bounds on $\Lambda$ produce similar bounds on $\Pi(\Lambda)$, as
$P[\Pi(\Lambda_l) > \Pi(\Lambda) \mid x] > 0$, and this representation of $\mathcal{N}$
does not guarantee bounds on $P[\mathcal{N} \le n \mid x]$. In the lemma
below we use a statistically equivalent representation of $\mathcal{N}$
to show that we may achieve bounds on $P[\mathcal{N} \le n \mid x]$ by
using the distributions of $\Lambda_l, \Lambda_u$.

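Lemma. Let $\Pi(\Lambda)$ be a conditional Poisson random
variable with mean $\Lambda$ given $\Lambda$, and let $\mathcal{N}(\Lambda) = \sum_{i=1}^{\Pi(\Lambda)} g_i$,
where $\{g_i\}$ are independent, identically distributed, non-negative
integer-valued random variables. Let $\Lambda' \le \Lambda$, a.s.
Then

$$P[\mathcal{N}(\Lambda) \le n] \le P[\mathcal{N}(\Lambda') \le n], \qquad n \ge 0.$$

Proof. We recall that $p_k = P[g_i = k]$, and define
$\{\Pi_k(\Lambda' p_k), \Pi_k(\Lambda p_k)\}$ to be a set of conditionally
mutually-independent Poisson random variables with the
indicated means given $(\Lambda, \Lambda')$, so that $\Pi(\Lambda) = \sum_{k \ge 1} \Pi_k(\Lambda p_k)$.
Under this conditioning, $\mathcal{N}(\Lambda)$ has the same distribution
as [4]

$$\mathcal{N}(\Lambda) = \sum_{k=1}^{\infty} k\, \Pi_k(\Lambda p_k).$$

It is straightforward to show that if $\{X_1, X_2, Y_1, Y_2\}$ are
conditionally mutually-independent random variables given
$\Lambda, \Lambda'$ and

$$P[X_i \le n \mid \Lambda, \Lambda'] \le P[Y_i \le n \mid \Lambda, \Lambda'], \qquad i = 1, 2,$$

then the same is true for the sum:

$$P[X_1 + X_2 \le n \mid \Lambda, \Lambda'] \le P[Y_1 + Y_2 \le n \mid \Lambda, \Lambda'].$$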


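Since the Poisson CDF is a decreasing function of the mean,
we have for $l = 1$

$$P\left[\sum_{k=1}^{l} k\, \Pi_k(\Lambda p_k) \le n \,\Big|\, \Lambda, \Lambda'\right] \le P\left[\sum_{k=1}^{l} k\, \Pi_k(\Lambda' p_k) \le n \,\Big|\, \Lambda, \Lambda'\right].$$

$$\Lambda_u = x + d + \frac{s}{PQ} \sum_{j=2}^{K} \left[ b_{j,-1} \lceil Q R_{j,1} \rceil + b_{j,0} \lceil Q \hat{R}_{j,1} \rceil \right]$$

where $\lfloor R \rfloor$ ($\lceil R \rceil$) is the greatest (least) integer function
of $R$. Then it is obvious that $\Lambda_l \le \Lambda \le \Lambda_u$, but can we
use $\Lambda_u, \Lambda_l$ to form bounds on the secondary electron count?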

The same is true for the unconditioned CDFs by smoothing.
The same holds for finite $l$ by induction on the above
fact, and for $l \to \infty$ by monotone sequential continuity of
the probability measure. □
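The quantization argument can be checked numerically in the simplest setting. The sketch below assumes unity gain (so the count is Poisson given its mean) and a conditional mean of the form x + d + (s/P) times the sum of the random mixtures; it rounds each mixture value as a whole to the grid of step 1/Q. The function names and all numerical values are hypothetical and chosen only for illustration.

```python
import math

def poisson_cdf(n, mean):
    """P[Poisson(mean) <= n], summed term by term."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n + 1):
        term *= mean / k
        total += term
    return total

def quantized_means(x, d, s, P, mixtures, Q):
    """Lower bound, exact value, and upper bound of the conditional mean.

    Each mixture value r is rounded down (up) to a multiple of 1/Q.
    """
    scale = s / P
    exact = x + d + scale * sum(mixtures)
    lower = x + d + scale * sum(math.floor(Q * r) / Q for r in mixtures)
    upper = x + d + scale * sum(math.ceil(Q * r) / Q for r in mixtures)
    return lower, exact, upper

# Hypothetical numbers: three interferers, quantization step 1/4.
lam_l, lam, lam_u = quantized_means(x=0.0, d=2.0, s=100.0, P=5,
                                    mixtures=[0.3, 1.6, 0.0], Q=4)
# A larger mean gives a smaller CDF, so the bounds on the mean swap roles.
cdf_bounds = (poisson_cdf(40, lam_u), poisson_cdf(40, lam), poisson_cdf(40, lam_l))
```

Increasing Q shrinks the gap between the two means, which is the sense in which the resulting CDF bounds can be made arbitrarily tight.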


Example: Prime Sequences and PIN Photodiodes

A necessary prerequisite to the comparison between error
rates of the CDMA matched filter receiver is the computation
of the random mixture distribution $(D(0), \ldots,
D(M), C(0), \ldots, C(M - 1))$, as seen in (3). These are computed
by a knowledge of the signature sequences, as well
as the distribution of the relative delay. Since the cross-correlations
of prime sequences are bounded above by [5]

$M = 2$, we must compute $(D(0), D(1), D(2), C(0), C(1))$
for the chip-synchronous and asynchronous cases. For the
prime sequences from GF(31), we have found that the average
distributions for the random mixtures are

(D(0),  D(l),  D(2),  C(0),  ('( 1 )) 
chip  synchronous  =>  (.57,  .36,  .07,  ,00,  .00) 
asynchronous  =t-  (  -M, .22,-01.  2-1, .00) 

As noted earlier, we have verified that the MAI for prime
sequences is well-modeled by a sum of independent, identically
distributed (IID) random variables in the sense that the
mean, variance, and third central moment of the MAI using
the IID assumption and the average distribution were
identical to the exact MAI moments, while the fourth central
moments differed by less than .004% for 29 interferers.
Further, these distributions did not differ significantly for
the prime sequences from GF(11) and GF(17), and we use
these distributions for all calculations.
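For readers who want to experiment with prime sequences, the following sketch builds them with the standard construction (one pulse per block of p chips, at offset i·j mod p in block j) and computes periodic cross-correlations. This construction is assumed here to match the codes of [5]; the function names are illustrative, and the periodic correlation at integer chip shifts corresponds only to the chip-aligned case in which both overlapping bits of the interferer are ones.

```python
def prime_sequences(p):
    """Return the p prime-code signature sequences of length p*p and weight p.

    Code i has a single 1 in each block of p chips, at offset (i * j) % p
    within block j (p is assumed to be prime).
    """
    codes = []
    for i in range(p):
        code = [0] * (p * p)
        for j in range(p):
            code[j * p + (i * j) % p] = 1
        codes.append(code)
    return codes

def periodic_cross_correlation(a, b):
    """Number of coinciding pulses of a and a cyclic shift of b, for every shift."""
    n = len(a)
    return [sum(a[t] * b[(t + shift) % n] for t in range(n)) for shift in range(n)]

codes = prime_sequences(5)
# Largest cross-correlation over all pairs of distinct codes and all chip shifts.
peak = max(
    max(periodic_cross_correlation(codes[i], codes[k]))
    for i in range(len(codes))
    for k in range(len(codes))
    if i != k
)
```

With p = 17 the same function yields the weight-17, length-289 sequences; the small case p = 5 is used here only to keep the exhaustive correlation search cheap.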

In Figure 2 we have plotted the minimum error probability
of the matched-filter CDMA receiver for the chip-synchronous
assumption and for completely asynchronous
transmission. We have used the weight 17 and length 289
prime sequences from GF(17), a received optical energy of
$s = 1000$ photons per bit, and a dark current contribution
of $d = 50$ thermoelectrons per bit. For a bit rate of $R$ bits
per second, these numbers correspond to a peak received
power of $R \cdot 10^{-7}$ mW and a photodetector dark current
of approximately $R \cdot 10^{-8}$ nA. From Figure 2 we see that
in this particular case the chip synchronous approximation
upper bounds the error rate in the asynchronous case by
at least one order of magnitude.

The error rates are ordered in this way due exclusively
to the differences of the distributions of the random mixtures
shown above. Note that the means of the random
mixtures are identical in both cases, while the ordering of
the variances coincides with that of the error rates. Thus
the MAI has identical means under these distributions, and
second moments whose ordering coincides with that of the
error rates. It is easy to show that $E[\mathcal{N} \mid x] = \bar{\Lambda}$ and
$\mathrm{Var}(\mathcal{N} \mid x) = \bar{\Lambda} - (\bar{\Lambda})^2 + \overline{\Lambda^2}$, which implies that under each
hypothesis on $x$ the mean of $\mathcal{N}$ is unchanged by the approximation
of chip synchronism, yet the variance of $\mathcal{N}$ given
$x$ increases as we proceed from complete asynchronism to
chip-synchronism. From the ordering of the minimum error
rate curves in Figure 2, we see that an increase in the
variance of $\mathcal{N}$ under each hypothesis results in an increased
error rate as the conditional means of $\mathcal{N}$ are fixed.
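The mean and variance comparison can be reproduced for any mixed distribution of the kind used here. The sketch below assumes a point mass D[q] at each integer q and a constant density C[q] on the interval (q, q + 1); the function name is hypothetical. The first call uses the chip-synchronous values quoted above, and the second pair of calls is an invented toy comparison showing how moving mass from the integers to the open intervals can keep the mean and lower the variance.

```python
def mixture_moments(D, C):
    """Mean and variance of a mixed random variable with point mass D[q] at the
    integer q and constant density C[q] on the open interval (q, q + 1)."""
    mean = (sum(q * d for q, d in enumerate(D))
            + sum(c * (q + 0.5) for q, c in enumerate(C)))
    # E[U^2] for U uniform on (q, q + 1) is q^2 + q + 1/3.
    second = (sum(q * q * d for q, d in enumerate(D))
              + sum(c * (q * q + q + 1.0 / 3.0) for q, c in enumerate(C)))
    return mean, second - mean ** 2

# Chip-synchronous distribution (D(0), D(1), D(2), C(0), C(1)) quoted in the text.
sync_mean, sync_var = mixture_moments([0.57, 0.36, 0.07], [0.0, 0.0])

# Toy comparison (not from the paper): same mean, discrete versus continuous mass.
point_mean, point_var = mixture_moments([0.5, 0.5], [0.0])
unif_mean, unif_var = mixture_moments([0.0, 0.0], [1.0])
```

Because the conditional variance of the count adds the variance of the conditional mean to its mean, a mixture with larger variance at the same mean directly inflates the variance of the count.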


Figure 2. Comparison of the Minimum Error Rates for Complete
Asynchronism and Chip Synchronous Approximation

Direct detection systems often require large received
optical energies to achieve an acceptable error rate when a
PIN photodiode is used, so we are interested in the asymptotic
distribution of (a scaled version of) $\mathcal{N}$. The question
is more formally worded as: if $\mathcal{N}$ is a conditionally-Poisson
random variable with mean $\Lambda$ given $\Lambda$, and a normalized
version of $\Lambda$ tends in distribution to a random variable $\phi$
as some parameter grows without bound, what does the
distribution of the normalized count tend to? In the simple
case when $\Lambda$ is deterministic, it is well known that the
normalized count converges in distribution to a standard
Gaussian random variable. Is this the case in general?

The question was answered independently by Serfozo [6]
and Grandell [7] for the special case when $\bar{\Lambda} \to \infty$, and
depends on the limit $\rho$ defined as $\lim \sigma_\Lambda^2 / \bar{\Lambda}$. If $\rho = 0$,
then the normalized count converges in distribution to a
standard Gaussian. If $\rho = \infty$, then the normalized count
converges in distribution to $\phi$. Finally, if $0 < \rho < \infty$,
then the normalized count converges in distribution to an
independent mixture of a standard Gaussian and $\phi$.

In our case, the parameter is the received signal energy
per bit, $s$, and the condition $\bar{\Lambda} \to \infty$ is satisfied as
$\bar{\Lambda}$ is proportional to $s$. It is this fact that also sets $\rho$ to
$\infty$, and we have from the result above that for large signal
energies the normalized count converges in distribution to
the scaled conditional mean $\phi$. This asymptotic result is a
weaker form of what is more commonly known as "perfect
optical-to-electrical conversion", in which the integrated
photocurrent is equal (a.s.) to the integrated optical intensity.
It will be seen in the numerical results presented
next that the asymptotic statistic is far from being a deterministic
signal in Gaussian noise, as the MAI is far from
Gaussian even for a moderate number of users.


Figure 3. Comparison of Error Rates for Various Approximations

In Figure 3 we have compared the minimum error
rates of the CDMA matched-filter receiver based on perfect
optical-to-electrical conversion (the high energy limit)
to those for the true distribution of $N$ at various finite
optical energies. In this example we have used the prime
sequence from GF(11). Also, we have plotted the minimum
error rate under the additional assumption of
Gaussian-distributed MAI. We note that even for modest received
optical energies of 10,000 photons per bit the error rate
exceeds that predicted by the asymptotic distribution by
at least an order of magnitude. Figure 3 shows that the
minimum error rate is a decreasing function of the received
optical energy, as expected. Further, we note that a Gaussian
assumption on the MAI, together with the perfect
optical-to-electrical assumption, is a poor estimate of the
true minimum error rate curve, except for user group sizes
exceeding, say, 10 users. In particular, this assumption
overestimates the error rate for moderate to large incident
optical energies.

As a result of the perfect optical-to-electrical-conversion
approximation, the boundedness of the MAI leads to an
“error-free” condition for sufficiently small numbers of
interferers. This occurs since the supports of the conditional
distributions of the test statistic are disjoint under these
assumptions. Since prime sequences have cross correlations
that are bounded above by 2, the necessary condition for
prime sequences is K - 1 < P/2. This assumption predicts
zero error rate for K ≤ 6 in Figure 3, which indicates
that the perfect optical-to-electrical assumption accurately
predicts the “error-free” condition only for incident optical
energies exceeding 10,000 photons per bit: the error rate
for K = 6 at this energy is roughly $10^{-1*}$.
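
The condition quoted above can be checked with a short calculation. The following is only a sketch: it takes the inequality K - 1 < P/2 exactly as stated in the text, and the function name `max_error_free_users` is illustrative rather than taken from the paper.

```python
def max_error_free_users(p: int) -> int:
    """Largest K satisfying K - 1 < p / 2 (the condition quoted in the text)."""
    k = 1
    # Keep admitting one more user while the strict inequality still holds.
    while (k + 1) - 1 < p / 2:
        k += 1
    return k


if __name__ == "__main__":
    # Prime sequence from GF(11), as used for Figure 3.
    print(max_error_free_users(11))
```

For the GF(11) example the loop stops once the number of interferers K - 1 would reach 6, since 6 is not smaller than 5.5.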

In Figure 4 we have plotted the optimal thresholds,
normalized by the signal energy, s, for those error rate
curves plotted in Figure 3. As the incident optical energy
per bit increases, the normalized optimal threshold
increases to unity, which is the curve corresponding to
the asymptotically optimal test. Note that the Gaussian
MAI, perfect optical-to-electrical approximation predicts a
threshold that significantly underestimates the true optimal
threshold for those incident optical energies needed to
dominate the dark current.


Observe  that  the  asymptotic  test  yields  a  more  accu¬ 
rate  estimate  of  the  optimal  threshold  for  moderate  signal 
energies.  Optimal  thresholds  for  large  incident  optical  en¬ 
ergies  are  not  plotted  for  the  “error-free"  region  because 
they  could  not  be  reliably  determined  due  to  the  vanishing 
error rate.

Abstract

The  optimal  signal  design  problem  for  a  band-limited 
PAM  symbol-synchronous  Gaussian  two-user  multiple-access 
channel is investigated. Using the root-mean-square and the
fractional out-of-band energy bandwidth definitions, we find
the  capacity  region  of  the  channel  and  the  signature  waveforms 
to  achieve  each  point  inside  the  capacity  region.  The  optimal 
pair  of  signature  waveforms  are  mirror  images  of  each  other, 
and  are  obtained  by  minimizing  their  cross-correlation  sub¬ 
ject  to  a  fixed  finite  duration  and  the  bandwidth  constraint. 
The  two-user  capacity  region,  in  the  rms  case,  is  found  to  con¬ 
tain  the  capacity  region  of  the  two-user  strictly  band-limited 
Gaussian  channel.  This  demonstrates  the  fact  that  by  relaxing 
constraints  in  the  frequency  domain,  we  can  introduce  struc¬ 
ture  (PAM)  in  the  time  domain  and  obtain  a  larger  capacity 
region. 


1.  Introduction 

The capacity region of the two-user discrete-time Gaussian
multiple-access channel

$$y_i = x_{1i} + x_{2i} + n_i,$$

where $n_i$ is an i.i.d. Gaussian sequence with variance equal to
$\sigma^2$ and the energy of each codeword is constrained to satisfy

$$\frac{1}{n}\sum_{i=1}^{n} x_{ki}^2 \le w_k, \qquad k = 1, 2,$$


is equal to the Cover-Wyner pentagon [1], [2]:

$$C_D = \left\{ (R_1, R_2) : \begin{array}{l} 0 \le R_1 \le \frac{1}{2}\log\left(1 + \frac{w_1}{\sigma^2}\right) \\ 0 \le R_2 \le \frac{1}{2}\log\left(1 + \frac{w_2}{\sigma^2}\right) \\ R_1 + R_2 \le \frac{1}{2}\log\left(1 + \frac{w_1 + w_2}{\sigma^2}\right) \end{array} \right\} \tag{1}$$

in information units per channel use. Analogously, the capacity
region of the continuous-time band-limited channel with
noise power spectral density, bandwidth and user signal
power equal to $\sigma^2$, $B$, and $S_k$ respectively, is given by [2], as
(in units per second)

$$C_C = \left\{ (R_1, R_2) : \begin{array}{l} 0 \le R_1 \le B\log\left(1 + \frac{S_1}{2B\sigma^2}\right) \\ 0 \le R_2 \le B\log\left(1 + \frac{S_2}{2B\sigma^2}\right) \\ R_1 + R_2 \le B\log\left(1 + \frac{S_1 + S_2}{2B\sigma^2}\right) \end{array} \right\} \tag{2}$$

This capacity region is achieved by approximately band-limited
and approximately time-limited waveforms which have
no particular structure. In order to deal with modulation and
demodulation schemes with manageable complexity, it is
customary in digital communications to introduce structure on
the transmitted waveforms by slotting the time domain into
intervals of length T and sending a symbol in each slot by
means of a digital modulation format such as PAM, PSK,
FSK, etc. In the case of PAM (Pulse Amplitude Modulation),
the kth user is assigned a fixed deterministic waveform, $s_k(t)$,
which is time-limited to [0,T] and is modulated by the
information stream. Then, assuming that the transmitters are
symbol-synchronous, the PAM two-user multiple-access channel
becomes

$$y(t) = \sum_{i=1}^{n} \left[ b_1(i)\,s_1(t - iT) + b_2(i)\,s_2(t - iT) \right] + n(t) \tag{3}$$

where $n(t)$ is white Gaussian noise with spectral density $\sigma^2$
and $\{b_k(i)\}$ is the symbol stream transmitted by the kth user.
Assuming that, without loss of generality, the signature
waveforms have unit energy, the energy constraints on the
transmitted waveforms become

$$\frac{1}{n}\sum_{i=1}^{n} b_k^2(i) \le w_k = T S_k, \qquad k = 1, 2 \tag{4}$$

It is easy to show that if $s_1(t) = s_2(t)$, then the capacity
of (3) under constraints (4) is equal to the Cover-Wyner
pentagon (1) (this result remains true even if the users are
completely asynchronous [3]). If the signature waveforms are
not necessarily identical, then the Cover-Wyner pentagon
generalizes to [4]

$$C_V = \left\{ (R_1, R_2) : \begin{array}{l} 0 \le R_1 \le \frac{1}{2}\log\left[1 + \frac{w_1}{\sigma^2}\right] \\ 0 \le R_2 \le \frac{1}{2}\log\left[1 + \frac{w_2}{\sigma^2}\right] \\ R_1 + R_2 \le \frac{1}{2}\log\left[1 + \frac{w_1}{\sigma^2} + \frac{w_2}{\sigma^2} + \frac{w_1 w_2}{\sigma^4}(1 - \rho^2)\right] \end{array} \right\} \tag{5a}$$

in information units per channel use or


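$$C_V = \left\{ (R_1, R_2) : \begin{array}{l} 0 \le R_1 \le \frac{1}{2T}\log\left[1 + \frac{T S_1}{\sigma^2}\right] \\ 0 \le R_2 \le \frac{1}{2T}\log\left[1 + \frac{T S_2}{\sigma^2}\right] \\ R_1 + R_2 \le \frac{1}{2T}\log\left[1 + \frac{T S_1}{\sigma^2} + \frac{T S_2}{\sigma^2} + \frac{T^2 S_1 S_2}{\sigma^4}(1 - \rho^2)\right] \end{array} \right\} \tag{5b}$$

in information units per second, where $\rho = \int_0^T s_1(t)\,s_2(t)\,dt$ is
the cross-correlation between the signature waveforms.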
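
To make the role of the cross-correlation concrete, the sketch below evaluates the rate-sum constraint of (5b) for a few values of ρ. It assumes the constraint has the form (1/2T) log[1 + TS1/σ² + TS2/σ² + (T²S1S2/σ⁴)(1 - ρ²)], uses natural logarithms (nats per second), and the numerical values are arbitrary illustrations rather than values from the paper.

```python
import math


def single_user_bound(T, S, sigma2):
    """Single-user constraint of (5b): (1/2T) log(1 + T S / sigma^2)."""
    return math.log(1 + T * S / sigma2) / (2 * T)


def rate_sum_bound(T, S1, S2, sigma2, rho):
    """Rate-sum constraint of (5b) for cross-correlation rho (assumed form)."""
    a1 = T * S1 / sigma2
    a2 = T * S2 / sigma2
    return math.log(1 + a1 + a2 + a1 * a2 * (1 - rho ** 2)) / (2 * T)


if __name__ == "__main__":
    T, S1, S2, sigma2 = 1.0, 10.0, 5.0, 1.0  # illustrative values only
    for rho in (0.0, 0.5, 1.0):
        print(rho, rate_sum_bound(T, S1, S2, sigma2, rho))
```

At ρ = 0 the argument of the logarithm factors as (1 + TS1/σ²)(1 + TS2/σ²), so the rate-sum bound equals the sum of the two single-user bounds, which is the decoupling into independent single-user channels. At ρ = 1 it reduces to the Cover-Wyner form with the two powers added.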


A natural question to address is the choice of the
unit-energy waveforms $s_1(t)$ and $s_2(t)$ to maximize the capacity
region $C_V$. It is clear that the unconstrained solution is to
choose orthogonal signature waveforms. Then, $\rho = 0$, and the
multiple-access channel is decoupled into independent
single-user channels, and each transmitter can transmit at single-user
capacity. However, in practice, there are constraints on the
choice of the signals (e.g. in Spread Spectrum CDMA systems,
the waveforms may be constrained to be Pseudo Noise shift
register sequences of given period), and it is not always
possible to assign orthogonal waveforms for all users. In this paper,
we will address the optimization of the signature waveforms
and their duration T under bandwidth constraints. Since the
signature waveforms are strictly time-limited, they cannot be
strictly band-limited, and the need arises to quantify the
bandwidth of these signals. There are several established ways to
accomplish this [5]. In this paper, we will consider the two
bandwidth measures of baseband signals that have received
most attention from the information theoretic community: the
root mean square (rms) bandwidth and the fractional
out-of-band energy (fobe) bandwidth.

The rms bandwidth was popularized by Gabor [6] (it is
sometimes referred to as Gabor bandwidth) and studied
subsequently in [5], [7], and [8]. A finite-energy signal $s(t)$ has
rms bandwidth $B$ if its Fourier transform $S(f)$ satisfies

$$\frac{\int_{-\infty}^{\infty} f^2 |S(f)|^2\,df}{\int_{-\infty}^{\infty} |S(f)|^2\,df} = B^2 \tag{6}$$


i.e. the rms bandwidth is the square root of the “second
moment” of the energy spectral density ($|S(f)|^2$) of the
normalized signal or, equivalently, it is proportional to the
square root of the energy of its derivative,

$$\frac{1}{(2\pi)^2}\,\frac{\int_{-\infty}^{\infty} \left(\frac{ds(t)}{dt}\right)^2 dt}{\int_{-\infty}^{\infty} s^2(t)\,dt} = B^2 \tag{7}$$

The fobe bandwidth has been used in e.g. [5], [8] and is
defined as the bandwidth necessary to encompass a given
fraction (say $\alpha$) of the signal energy, i.e. the $\alpha$-fobe bandwidth is
$B$ if

$$\int_{-B}^{B} |S(f)|^2\,df = \alpha \int_{-\infty}^{\infty} |S(f)|^2\,df \tag{8}$$

Notice that the bandwidth constraints imposed on the
signature waveforms will be inherited by the transmitted
signals because, as is well known [9], the power spectral density
of $\sum_i b_k(i)\,s_k(t - iT - \tau)$, where $\tau$ is uniformly distributed in
[0,T] and $\{b_k(i)\}$ is an i.i.d. sequence, is a scaled version of
the energy spectral density $|S_k(f)|^2$.


2. Single-user Channel

Before solving for the capacity region of the PAM
multiple-access channel under bandwidth constraints, it is
enlightening to examine the PAM single-user channel with
constrained rms bandwidth. This channel differs from the
classical band-limited Gaussian channel in that the allowable
transmitted signals 1) have much more structure (PAM) and 2) are
rms band-limited but not strictly band-limited. It turns out
that the effect of the laxer bandwidth measure cancels the
effect of the additional structure imposed on the transmitted
signals in the time domain, and the capacity of the channel is
given by the celebrated Shannon formula [10].

Theorem 2.1.

The capacity of the single-user PAM white Gaussian
channel with noise power spectral density, rms bandwidth and
signal power equal to $\sigma^2$, $B$, and $S$ respectively is given by (in
units per second)

$$C = B\log\left(1 + \frac{S}{2B\sigma^2}\right) \tag{9}$$


Proof.

The single-user PAM white Gaussian channel is a special
case of (3):

$$y(t) = \sum_{i=1}^{n} b(i)\,s(t - iT) + n(t) \tag{10}$$

Assuming that, without loss of generality, $s(t)$ has unit energy,
the power constraint becomes

$$\frac{1}{n}\sum_{i=1}^{n} b^2(i) \le TS \tag{11}$$

and the T-shifts of $s(t)$, $\{s(t - iT)\}_{i=1}^{n}$, form an orthonormal
set. The projections of $y(t)$ on this orthonormal set are equal
to

$$y(i) = \int_{iT}^{(i+1)T} y(t)\,s(t - iT)\,dt, \qquad i = 1, \ldots, n \tag{12}$$

or, substituting $y(t)$ from (10),

$$y(i) = b(i) + n(i) \tag{13}$$

where $\{n(i)\}$ is an i.i.d. Gaussian sequence with variance equal
to $\sigma^2$.

The important point to note is that $\{y(i)\}_{i=1}^{n}$ are
sufficient statistics for the transmitted messages; therefore, the
capacity of the PAM channel (10) for a fixed T coincides with
the capacity of the discrete-time memoryless channel (13) with
constraint (11), which is given by (e.g. [11]) (in units per
second)

$$C_S(T) = \frac{1}{2T}\log\left[1 + \frac{ST}{\sigma^2}\right] \tag{14}$$

Since $C_S(T)$ is monotonically decreasing in T, the
capacity is maximized by minimizing T. However, due to the
rms bandwidth constraint, the value of T cannot be arbitrarily
small. Using the fact that the set $\{\sqrt{2/T}\,\sin(i\pi t/T)\}_{i=1}^{\infty}$ is a
complete orthonormal set in the space of all rms band-limited
signals in [0,T] [7], we can express $s(t)$ as

$$s(t) = \sum_{i=1}^{\infty} d_i \sqrt{\frac{2}{T}}\,\sin\left(\frac{i\pi t}{T}\right) \tag{15}$$

Then, the unit energy assumption and the constraint in the
rms bandwidth (7) translate into

$$\sum_{i=1}^{\infty} d_i^2 = 1 \tag{16}$$

and

$$\sum_{i=1}^{\infty} i^2 d_i^2 \le (2BT)^2 \tag{17}$$

respectively.

The minimum T consistent with (16) and (17) is chosen
by taking equality in (17) and minimizing the left hand side
of (17) subject to (16). Since

$$1 = \sum_{i=1}^{\infty} d_i^2 \le \sum_{i=1}^{\infty} i^2 d_i^2 \tag{18}$$

with equality if and only if $d_1 = 1$ and $d_i = 0$ for $i > 1$, we
have $(2BT)^2 \ge 1$, and it follows that the optimum T is equal to
$1/(2B)$, which upon substitution in (14) yields the desired
result. ∎
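
The two facts used in this proof lend themselves to a quick numerical check. The sketch below assumes the forms $C_S(T) = \frac{1}{2T}\log(1 + ST/\sigma^2)$ for (14) and $C = B\log(1 + S/(2B\sigma^2))$ for the theorem, with natural logarithms; the function names and parameter values are illustrative and not from the paper. It also evaluates the derivative form of the rms bandwidth (7) for the half-sine pulse $\sqrt{2/T}\sin(\pi t/T)$ by midpoint integration.

```python
import math


def pam_capacity(T, S, sigma2):
    """C_S(T) as in (14), in nats per second (assumed form)."""
    return math.log(1 + S * T / sigma2) / (2 * T)


def shannon_capacity(B, S, sigma2):
    """B log(1 + S / (2 B sigma^2)), the formula claimed in Theorem 2.1."""
    return B * math.log(1 + S / (2 * B * sigma2))


def rms_bandwidth_half_sine(T, steps=20000):
    """Evaluate (7) numerically for s(t) = sqrt(2/T) sin(pi t / T) on [0, T]."""
    dt = T / steps
    energy = 0.0
    derivative_energy = 0.0
    for n in range(steps):
        t = (n + 0.5) * dt  # midpoint rule
        s = math.sqrt(2 / T) * math.sin(math.pi * t / T)
        ds = math.sqrt(2 / T) * (math.pi / T) * math.cos(math.pi * t / T)
        energy += s * s * dt
        derivative_energy += ds * ds * dt
    return math.sqrt(derivative_energy / energy) / (2 * math.pi)


if __name__ == "__main__":
    B, S, sigma2 = 1000.0, 2.0e5, 1.0  # illustrative values only
    T_opt = 1 / (2 * B)
    print(pam_capacity(T_opt, S, sigma2), shannon_capacity(B, S, sigma2))
    print(rms_bandwidth_half_sine(T_opt))
```

With T = 1/(2B) the two capacity expressions coincide, any larger T gives a smaller value of $C_S(T)$, and the half-sine pulse of duration T has rms bandwidth 1/(2T), so it meets the bandwidth constraint with equality at the optimum.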


3.  Two-user  Channel 

We  turn  our  attention  to  the  main  results  of  the  paper, 
namely  the  optimization  of  the  capacity  region  of  the  syn¬ 
chronous  PAM  channel  (5b)  with  respect  to  the  choice  of  the 
signature  waveforms,  including  their  duration  T.  In  both  the 
rms  and  the  fobe  bandwidth  constrained  problems,  we  will 
solve  the  problem  in  two  stages: 


1. Fix T, and find $\rho^*(TB)$, the minimum absolute
cross-correlation, $|\rho|$, achievable under the time-bandwidth
constraint (and the optimal waveforms which achieve that $\rho$).
Then, the capacity region for fixed T is given by $C_V$ in
(5b) evaluated at $\rho = \rho^*(TB)$. This is because $C_V$
depends on the signature waveforms only through the
rate-sum constraint, which is monotonic decreasing in $|\rho|$.

2. Take the union of the capacity regions found in the first
stage over all T. Note that there is a minimum value of T
below which the time-bandwidth product is so small that
no waveform can be found to satisfy the bandwidth
constraint and therefore, the capacity region is an empty set.
Also, there is a maximum value of T above which the
allowed time-bandwidth product is so large that orthogonal
signals can be assigned to both users, and therefore the
capacity region decreases with T beyond that maximum
value of T.

Theorem  3.1. 

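If $TB \ge 0.5$, then the minimum cross-correlation,
$\rho^*_G(TB)$, between any two unit-energy signals of duration T
and rms bandwidth less than or equal to B is

$$\rho^*_G(TB) = \max\left\{0, \tfrac{1}{3}\left[5 - 8(TB)^2\right]\right\}$$

and is achieved by the signature waveforms

$$s_1(t) = \sqrt{\tfrac{2}{T}}\left[a\,\sin\left(\tfrac{\pi t}{T}\right) + \sqrt{1 - a^2}\,\sin\left(\tfrac{2\pi t}{T}\right)\right], \qquad s_2(t) = \sqrt{\tfrac{2}{T}}\left[a\,\sin\left(\tfrac{\pi t}{T}\right) - \sqrt{1 - a^2}\,\sin\left(\tfrac{2\pi t}{T}\right)\right],$$

for $t \in [0,T]$, with $a > 0$ and $a^2 = \max\left\{\tfrac{1}{2}, \tfrac{4}{3}\left[1 - (TB)^2\right]\right\}$.

If $TB < 0.5$, then there exists no signal of duration T and
rms bandwidth less than or equal to B.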

Proof. 

If  TB  <  0.5,  we  have  seen  in  the  proof  of  Theorem  2.1, 
that  there  is  no  signal  of  duration  T  and  rms  bandwidth  less 
than  or  equal  to  B. 

If $TB = 0.5$, we have seen that there is only one signal of
duration T and rms bandwidth B, namely $\sqrt{2/T}\,\sin(\pi t/T)$, $t \in [0,T]$.
Therefore, the theorem follows immediately when $TB = 0.5$.

If $TB > 0.5$, let $s_1(t)$, $s_2(t)$ be any two unit-energy
signals with duration T and rms bandwidth B. Using the same
complete orthonormal set as in the last theorem, we denote the
vector $M(t) = \left[\sqrt{2/T}\,\sin(\pi t/T),\ \sqrt{2/T}\,\sin(2\pi t/T),\ \ldots\right]^T$, $t \in [0,T]$,
and express $s_1(t)$ and $s_2(t)$ as

$$s_k(t) = a_k^T M(t), \qquad k = 1, 2 \tag{19}$$

Then, the rms bandwidth constraint can be expressed, via (7),
as

$$\frac{1}{(2T)^2}\,a_k^T\,\Pi\,a_k \le B^2, \qquad k = 1, 2 \tag{20}$$

where $\Pi = \mathrm{diag}[1^2, 2^2, 3^2, \ldots]$. Denoting $\rho$ as the
cross-correlation, we can assume that, without loss of generality,
$0 \le \rho$. From the unit energy assumption, we have the
cross-correlation matrix, H, as

$$H = AA^T = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}, \qquad A = [a_1\ a_2]^T \tag{21}$$

Since the mapping between $s_k(t)$ and $a_k$ is a one-to-one
mapping, the problem is equivalent to finding the minimum $\rho$ such
that there exists A satisfying (20) and (21).

We solve this problem by first giving a lower bound on
the cross-correlation and then showing that the lower bound
is achievable. Let $B_\Sigma$ be the minimum of the sum of the
squared rms bandwidths of M equal-energy signals of duration T and
correlation matrix H. $B_\Sigma$ was found by Nuttall [7] as

$$B_\Sigma = \frac{1}{(2T)^2}\sum_{i=1}^{r} i^2 \mu_i \tag{22}$$

where the $\mu_i$ are the positive eigenvalues of H, ordered so that
$\mu_i \le \mu_j$ for $j < i$, and r is the rank of H.

Applying this result with M = 2, r = 2 (since $s_1(t) \ne s_2(t)$
implies $\rho \ne 1$) and the correlation matrix H in (21), we get
from (20) and (22) that

$$\frac{1}{8T^2}\left[(1 + \rho) + 4(1 - \rho)\right] \le B^2 \tag{23}$$

where it can be easily verified that $1 + \rho$ and $1 - \rho$ are the
eigenvalues of H in (21).

After rearrangement, (23) becomes

$$\tfrac{1}{3}\left[5 - 8(TB)^2\right] \le \rho \tag{24}$$

Since $s_1(t)$ and $s_2(t)$ are arbitrarily chosen, and $\rho$ belongs to
[0, 1), we have the lower bound,

$$\max\left\{0, \tfrac{1}{3}\left[5 - 8(TB)^2\right]\right\} \le \rho^*_G(TB)$$

We now show a signal pair that achieves this lower bound.
Motivated by the fact that the functions $f(t)$ and $f(T - t)$
have the same magnitude spectrum, we consider signature
waveforms which are mirror images of each other about T/2.
Also, we note that $\sin(\pi t/T)$ is even about T/2 while $\sin(2\pi t/T)$ is
odd about T/2. Therefore, we assume that the matrix A has
the form

$$A = \begin{bmatrix} a & \sqrt{1 - a^2} & 0 & \cdots \\ a & -\sqrt{1 - a^2} & 0 & \cdots \end{bmatrix} \tag{25}$$

for some $0 < a < 1$.

From (20), the rms bandwidth constraint becomes
$\tfrac{4}{3}\left[1 - (TB)^2\right] \le a^2$. If we let $a^2 = \tfrac{4}{3}\left[1 - (TB)^2\right]$ and
substitute (25) into (21), we have $\rho = 2a^2 - 1 = \tfrac{1}{3}\left[5 - 8(TB)^2\right]$ if
$0.5 \le TB \le \sqrt{5/8}$. If $TB > \sqrt{5/8}$, then
$\tfrac{4}{3}\left[1 - (TB)^2\right] < \tfrac{1}{2}$ and we can let $a = 1/\sqrt{2}$, which
gives $\rho = 2a^2 - 1 = 0$. Therefore, we have shown that the
lower bound is achievable by signature waveforms characterized
by the matrix A in (25), with $a^2 = \max\left\{\tfrac{1}{2}, \tfrac{4}{3}\left[1 - (TB)^2\right]\right\}$. Then,
(19) results in the optimal signature waveforms stated in the
theorem. ∎
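
The achievability argument can be verified numerically. The sketch below assumes the coefficient choice $a^2 = \max\{1/2, (4/3)[1 - (TB)^2]\}$ for the matrix A in (25); it checks that the resulting cross-correlation $2a^2 - 1$ matches $\max\{0, [5 - 8(TB)^2]/3\}$ and that the rms bandwidth constraint (20) is met. Function names are illustrative and not from the paper.

```python
import math


def rho_star_rms(tb):
    """Minimum cross-correlation of Theorem 3.1 for time-bandwidth product tb."""
    if tb < 0.5:
        raise ValueError("no signal of duration T has rms bandwidth <= B when TB < 0.5")
    return max(0.0, (5 - 8 * tb ** 2) / 3)


def mixing_coefficient(tb):
    """Coefficient a of (25), from a^2 = max(1/2, (4/3)(1 - TB^2)) (assumed)."""
    return math.sqrt(max(0.5, 4 * (1 - tb ** 2) / 3))


def check_pair(tb):
    """Return (cross-correlation, bandwidth load) for the mirror-image pair.

    The load is (a^2 + 4 (1 - a^2)) / (2 TB)^2; constraint (20) requires it
    to be at most 1.
    """
    a = mixing_coefficient(tb)
    rho = 2 * a * a - 1
    load = (a * a + 4 * (1 - a * a)) / (2 * tb) ** 2
    return rho, load


if __name__ == "__main__":
    for tb in (0.5, 0.525, 0.7, 0.8, 1.2):
        print(tb, rho_star_rms(tb), check_pair(tb))
```

For time-bandwidth products up to $\sqrt{5/8}$ the constraint is met with equality, and beyond that point the pair is orthogonal with bandwidth to spare.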


Theorem  3.2. 

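The capacity region of the two-user PAM white Gaussian
multiple-access channel with noise power spectral density, rms
bandwidth and signal powers equal to $\sigma^2$, $B$, $S_1$ and $S_2$,
respectively, is given by

$$C_G = \bigcup_{1 \le \gamma \le \sqrt{5/2}} \left\{ (R_1, R_2) : \begin{array}{l} 0 \le R_1 \le \frac{B}{\gamma}\log\left(1 + \frac{\gamma S_1}{2B\sigma^2}\right) \\ 0 \le R_2 \le \frac{B}{\gamma}\log\left(1 + \frac{\gamma S_2}{2B\sigma^2}\right) \\ R_1 + R_2 \le \frac{B}{\gamma}\log\left(1 + \frac{\gamma S_1}{2B\sigma^2} + \frac{\gamma S_2}{2B\sigma^2} + \frac{\gamma^2 S_1 S_2}{4B^2\sigma^4}\left(1 - \left(\frac{5 - 2\gamma^2}{3}\right)^2\right)\right) \end{array} \right\} \tag{26}$$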

Proof.

Recall that the capacity region, $C_G$, is the union of $C_V$
in (5b) evaluated at $\rho^*(TB)$ over T. We proceed to find the
range of T of interest. From the last theorem, if $TB < 0.5$, no
signature waveforms can be found to satisfy the constraints
and the capacity region is an empty set. Also, if $TB > \sqrt{5/8}$,
$\rho^*(TB) = 0$, and the capacity region for fixed T is a pentagon
which is monotonic decreasing in T. Therefore, the range of
T of interest is the interval $\left[\frac{1}{2B}, \frac{1}{2B}\sqrt{5/2}\right]$. Denoting $2TB$ by $\gamma$,
and substituting $\gamma$ into $C_V$ in (5b), we have, after taking the
union, $C_G$ in the theorem. ∎

At first glance, it seems that there is a conflict with
Theorem 2.1 since the total capacity of $C_G$ is larger than
the single-user capacity of an rms band-limited channel with
power constraint $S_1 + S_2$. However, the signal transmitted
over the channel in the two-user case is a sum of two PAM
signals and, in general, it is no longer a PAM signal since the
signals in different time slots need not have the same shape.

Figure 1 shows the capacity region of the rms band-limited
PAM two-user channel, $C_G$, and the strictly band-limited
two-user channel, $C_C$. In contrast to the single-user case where
they coincide, $C_C$ is a subset of $C_G$. It can also be seen
from (26) and (2) that $C_C$ is the pentagon inside the union
in (26) when $\gamma = 1$. However, by increasing $\gamma$, we trade off
the decrease in the single-user rate against the increase in the
rate sum, such that the union gives a larger capacity region, $C_G$.
This indicates that, in the two-user case, the laxer bandwidth
constraint more than offsets the additional structure (PAM)
in the time domain.

Figures 2 and 3 show the signature waveforms which
achieve the boundary points of the capacity region for two
different time-bandwidth products. The signature waveforms
are mirror images of each other and as $\gamma$ increases, they
become more asymmetric so as to decrease the cross-correlation
while maintaining the same rms bandwidth.

Finally, although the union in Theorem 3.2 is taken over
$\gamma$ in the interval $[1, \sqrt{5/2}]$, not every $\gamma$ in that interval achieves
some boundary points of $C_G$. The set of values of $\gamma$ that
achieves boundary points of $C_G$ is a function of the
signal-to-noise ratios, $\mathrm{SNR}_k$, $k = 1, 2$. According to Figure 1, the
boundary points in the segments AB and EF are achieved by
$\gamma = 1$, while those in the segment CD are achieved by some
$\gamma_{max}$ in $[1, \sqrt{5/2}]$, depending on the signal-to-noise ratios. The
boundary points in BC and DE are achieved by $1 < \gamma < \gamma_{max}$.

We now proceed to the optimal signal design problem
under the $\alpha$-fobe bandwidth constraint. Denote the prolate
spheroidal wave functions ([12], [13], and [14]) as $\psi_i(TB, t)$
and the associated eigenvalues as $\lambda_i(TB)$, i.e.

$$\lambda_i(TB)\,\psi_i(TB, t) = \int_{-1/2}^{1/2} \psi_i(TB, \tau)\,\frac{\sin[2\pi TB(t - \tau)]}{\pi(t - \tau)}\,d\tau$$

for $i = 0, 1, 2, \ldots$ and $\lambda_0(TB) > \lambda_1(TB) > \lambda_2(TB) > \cdots$.
It is known [8] that $\psi_0(TB, \frac{t}{T} - \frac{1}{2})$ and $\psi_1(TB, \frac{t}{T} - \frac{1}{2})$
are even and odd about $T/2$ respectively and the set of
suitably normalized functions $\{\psi_i(TB, \frac{t}{T} - \frac{1}{2})\}$ forms a
complete orthonormal set in [0,T]. Also, $\lambda_0(TB)$ and
$\lambda_0(TB) + \lambda_1(TB)$ are continuous and monotonic increasing in
TB (Figure 4).

Theorem 3.3.

For any $0 < \alpha < 1$,

If $TB \ge \lambda_0^{-1}(\alpha)$, then the minimum cross-correlation,
$\rho^*_F(TB)$, between any two unit-energy signals of duration T,
and $\alpha$-fobe bandwidth less than or equal to B is

$$\rho^*_F(TB) = \max\left\{0, \frac{2\alpha - \lambda_0(TB) - \lambda_1(TB)}{\lambda_0(TB) - \lambda_1(TB)}\right\}$$

and is achieved by the signature waveforms


■  «1(0  = 

t  ~  i) +  t  ~  2) 


*2(0  = 

7  “  i)  -  t  ~  2) 

If $TB < \lambda_0^{-1}(\alpha)$, then there exists no signal of duration
T and $\alpha$-fobe bandwidth less than or equal to B.


Proof. 


As in Theorem 3.1, we would like to find a suitable
complete orthonormal set in [0,T]. To that end, we rewrite the
definition of $\alpha$-fobe bandwidth as

$$\int_{-B}^{B} |S(f)|^2\,df = \int_0^T\!\!\int_0^T s(t)\,s(\tau) \int_{-B}^{B} e^{j2\pi f(t - \tau)}\,df\,dt\,d\tau \tag{27}$$

Since the prolate spheroidal wave functions are eigenfunctions
of the sinc kernel arising from the inner integral of (27), a good
choice for the complete orthonormal set will be the set of all
prolate spheroidal wave functions.

For notational convenience, we will drop the explicit
dependence on TB of the eigenvalues of the prolate spheroidal
wave functions. If $TB \ge \lambda_0^{-1}(\alpha)$, we can express any $s_1(t)$ and
$s_2(t)$ in terms of the vector $\Psi(t)$ of the normalized functions
$\psi_0(TB, \frac{t}{T} - \frac{1}{2}),\ \psi_1(TB, \frac{t}{T} - \frac{1}{2}),\ \ldots$, $t \in [0,T]$, as

$$s_k(t) = a_k^T \Psi(t), \qquad k = 1, 2 \tag{28}$$

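Using (27) and (28), we have

$$\alpha \le \int_{-B}^{B} |S_k(f)|^2\,df = a_k^T \Lambda a_k = \mathrm{tr}(\Lambda a_k a_k^T), \qquad k = 1, 2 \tag{29}$$

where $\Lambda = \mathrm{diag}[\lambda_0, \lambda_1, \lambda_2, \ldots]$. Also, the cross-correlation
matrix, H, is

$$H = AA^T = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \tag{30}$$

Similar to the rms case, we find the lower bound by maximizing
the average over $k = 1, 2$ of the right hand side of (29).
Rewriting the average, we have

$$\frac{1}{2}\sum_{k=1}^{2} a_k^T \Lambda a_k = \frac{1}{2}\mathrm{tr}(\Lambda A^T A) = \frac{1}{2}\mathrm{tr}(\Lambda P_A^T \Sigma_A P_A) = \frac{1}{2}\mathrm{tr}(\Sigma_A P_A \Lambda P_A^T) \tag{31}$$

where $A^T A$ is diagonalized by the orthonormal matrix, $P_A$,
and $\Sigma_A = \mathrm{diag}[\xi_1\ \xi_2\ 0\ \ldots]$. Since the nonzero eigenvalues of
$AA^T$ and $A^T A$ are the same, we have $\xi_1 = 1 + \rho$ and
$\xi_2 = 1 - \rho$.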

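Now, let us denote P as the $2 \times \infty$ matrix formed by taking
only the first two rows of $P_A$, and $\Sigma$ as $\mathrm{diag}[\xi_1\ \xi_2]$. Then,
the maximum of the average is

$$\frac{1}{2}\max_{PP^T = I_{2 \times 2}} \mathrm{tr}(\Sigma P \Lambda P^T) \tag{32}$$

We will solve the maximization problem using the Lagrange
multiplier method. We form the Lagrangian,

$$\sum_{k=1}^{2} \xi_k\,p_k^T \Lambda p_k + \sum_{k=1}^{2}\sum_{n=1}^{k} z_{nk}\left(p_k^T p_n - \delta_{nk}\right) \tag{33}$$

where $p_k^T$ is the kth row of P. Taking the derivative with respect
to $p_k$, we have

$$\xi_1 \Lambda p_1 + z_{11}\,p_1 + \tfrac{1}{2} z_{12}\,p_2 = 0 \tag{34}$$

and

$$\xi_2 \Lambda p_2 + z_{22}\,p_2 + \tfrac{1}{2} z_{12}\,p_1 = 0 \tag{35}$$

If we pre-multiply (34) by $p_2^T$ and (35) by $p_1^T$, we have $z_{12} = 0$
since $\xi_1 \ne \xi_2$; therefore, from (34) and (35), $p_1$ and $p_2$
are eigenvectors of $\Lambda$. Since $\Lambda$ is diagonal and the diagonal
elements are distinct and decreasing down the diagonal,
we have

$$P = [\,I_{2 \times 2}\ \ 0\,] \tag{36}$$

Substituting back into (31), we show that the maximum value
of (31) is $\frac{1}{2}\left[(1 + \rho)\lambda_0 + (1 - \rho)\lambda_1\right]$. Comparing to (29), we have

$$\alpha \le \tfrac{1}{2}\left[(\lambda_0 + \lambda_1) + \rho(\lambda_0 - \lambda_1)\right]$$

or, together with $0 \le \rho < 1$,

$$\max\left\{0, \frac{2\alpha - \lambda_0 - \lambda_1}{\lambda_0 - \lambda_1}\right\} \le \rho$$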

The  achievability  of  the  lower  bound  can  be  verified,  as 
in  the  rms  case,  by  letting 


which  corresponds  to  the  optimal  set  of  signals  stated  in  the 
theorem. 

The proof of the second part of the theorem ($TB < \lambda_0^{-1}(\alpha)$)
can be found in [13, p. 54]. ∎
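
The bound of Theorem 3.3 is straightforward to evaluate once the two largest prolate spheroidal eigenvalues are known. The sketch below takes $\lambda_0(TB)$ and $\lambda_1(TB)$ as inputs rather than computing them, and the numerical values used are hypothetical, chosen only to exercise the formula. It also evaluates the averaged in-band energy $\frac{1}{2}[(1 + \rho)\lambda_0 + (1 - \rho)\lambda_1]$ from the proof; function names are illustrative.

```python
def rho_star_fobe(alpha, lam0, lam1):
    """max{0, (2 alpha - lam0 - lam1) / (lam0 - lam1)} from Theorem 3.3.

    lam0 and lam1 stand for lambda_0(TB) and lambda_1(TB); they must be
    supplied by the caller (hypothetical values in the example below).
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not 0 < lam1 < lam0 <= 1:
        raise ValueError("expected 0 < lam1 < lam0 <= 1")
    if lam0 < alpha:
        raise ValueError("TB too small: no signal reaches the in-band fraction alpha")
    return max(0.0, (2 * alpha - lam0 - lam1) / (lam0 - lam1))


def average_inband_energy(rho, lam0, lam1):
    """Maximum of (31): 0.5 * [(1 + rho) lam0 + (1 - rho) lam1]."""
    return 0.5 * ((1 + rho) * lam0 + (1 - rho) * lam1)


if __name__ == "__main__":
    alpha, lam0, lam1 = 0.8, 0.9, 0.5  # hypothetical eigenvalues
    rho = rho_star_fobe(alpha, lam0, lam1)
    print(rho, average_inband_energy(rho, lam0, lam1))
```

When the bound is positive, the averaged in-band energy at that cross-correlation equals $\alpha$, so the constraint (29) is met with equality; when $\lambda_0 + \lambda_1 \ge 2\alpha$ the pair can be made orthogonal.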


Theorem  3.4. 

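The capacity region of the two-user PAM white Gaussian
multiple-access channel with noise power spectral density,
$\alpha$-fobe bandwidth and signal powers equal to $\sigma^2$, $B$, $S_1$, and $S_2$,
respectively, is given by

$$C_F = \bigcup_{\gamma_{min} \le \gamma \le \gamma_{max}} \left\{ (R_1, R_2) : \begin{array}{l} 0 \le R_1 \le \frac{B}{\gamma}\log\left(1 + \frac{\gamma S_1}{2B\sigma^2}\right) \\ 0 \le R_2 \le \frac{B}{\gamma}\log\left(1 + \frac{\gamma S_2}{2B\sigma^2}\right) \\ R_1 + R_2 \le \frac{B}{\gamma}\log\left(1 + \frac{\gamma S_1}{2B\sigma^2} + \frac{\gamma S_2}{2B\sigma^2} + \frac{\gamma^2 S_1 S_2}{4B^2\sigma^4}\left(1 - \rho^*_F(\tfrac{\gamma}{2})^2\right)\right) \end{array} \right\}$$

where $\alpha = \lambda_0(\frac{\gamma_{min}}{2}) = \frac{1}{2}\left[\lambda_0(\frac{\gamma_{max}}{2}) + \lambda_1(\frac{\gamma_{max}}{2})\right]$.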

Proof. 

The proof is very similar to that of Theorem 3.2, where
$\gamma = 2TB$. The lower limit of $\gamma$ is carried over from Theorem
3.3, while $\gamma_{max}$ is the smallest $\gamma$ such that $\rho^*_F(\frac{\gamma}{2}) = 0$. ∎

Notice that the range of $\gamma$ in taking the union is only a
function of $\alpha$. In Figure 4, we show $\lambda_0(TB)$ and $\frac{1}{2}[\lambda_0(TB) +
\lambda_1(TB)]$ vs the time-bandwidth product, and $\gamma_{min}$ and $\gamma_{max}$
can be obtained directly from the figure. Also, Figure 5
shows the capacity region, $C_F$, with the capacity region of
the strictly band-limited channel, $C_C$. Similar comments to
those we made in the rms case apply to the values of $\gamma$ that
achieve the boundary points in the capacity region. However,
we see that for sufficiently high $\alpha$, $C_F$ does not contain $C_C$, in
contrast to the rms case.

Finally, in Figure 6, we show the signature waveforms
which are, as expected, mirror images of each other. However,
in contrast to the rms case where the signature waveforms
must be zero at the end points to have finite rms bandwidth,
the transmitted signal waveform in the $\alpha$-fobe case may have
jumps at $t = iT$.


Figure 1. Capacity regions in the rms case for SNR1 = SNR2 = 20 dB, B = 1 kHz

Figure 2. Signature waveforms for two-user rms band-limited channel, TB = 0.8

Figure 3. Signature waveforms for two-user rms band-limited channel, TB = 0.525

[Figure 4. Horizontal axis: time-bandwidth product]

Figure 5. Capacity regions in the fobe case for SNR1 = SNR2 = 20 dB, B = 1 kHz [Horizontal axis: rate for User 1]

Figure 6. Signature waveforms for two-user 0.8-fobe band-limited channel, TB = 0.955


TOTAL CAPACITY OF THE RMS BANDLIMITED K-USER PAM
SYNCHRONOUS CHANNEL

ROGER S. CHENG & SERGIO VERDU
Department of Electrical Engineering
Princeton University, Princeton, NJ 08544

ABSTRACT

Continuous-time additive white Gaussian noise channels with strictly time-limited and
root mean square (RMS) bandlimited inputs are studied. RMS bandwidth is equal to the
normalized second moment of the spectrum, which has proved to be a useful and analytically
tractable measure of the bandwidth of strictly time-limited waveforms.

We find the Total Capacity (TC) of the K-user channel under total power and
power-weighted average RMS bandwidth constraints. A lower bound to the TC under equal-power
constraint is obtained. Total Capacity Ratio (TCR) is defined as the ratio of the K-user TC
to K times the single-user capacity. Power (Bandwidth) efficiency is defined as the ratio of the
effective power (bandwidth) to the actual power (bandwidth). The effective power (bandwidth)
is the corresponding power (bandwidth) needed for a single-user channel to achieve the same
capacity. We find lower bounds to the TCR and efficiencies which indicate that savings in
bandwidth compared to the FDMA scheme can be achieved by the CDMA scheme at the
expense of more complicated decoding hardware.

1.  INTRODUCTION 

In this paper, we deal with the continuous-time Pulse Amplitude Modulation (PAM)
Gaussian multiple-access channel (MAC). Each user is assigned a fixed deterministic continuous-time
signature waveform, $s_k(t)$, which is time-limited to [0,T] and is modulated linearly by the
information stream. Assuming that the transmitters are symbol-synchronous, the channel can
be expressed as

$$y(t) = \sum_{i=1}^{n}\sum_{k=1}^{K} b_k(i)\,s_k(t - iT) + n(t) \tag{1}$$

where  n(t)  is  white  Gaussian  noise  with  spectral  density,  ^  and  {bk(i)}  is  symbol  stream 
transmitted  by  the  user. 

The  capacity  region  of  this  channel  has  been  found  by  Verdu  [1]  [2].  Denoting  W  and  H 
as  the  diagonal  matrix  with  the  users’  powers  as  its  diagonal  entries,  and  the  cross-correlation 
matrix  of  the  normalized  signature  waveforms,  respectively,  the  capacity  region  is  expressed  as 

cv=  |  (*!.*, . RK ):  +  VJ  C  {1 . Ji"}  | 

(2)

where $A_J$ is the $|J| \times |J|$ matrix formed by the rows and columns of $A$ indexed by all $j \in J$. It is
clear that, without other constraints, the capacity region is maximized by orthogonal signature
waveforms. However, under bandwidth constraints, orthogonal signature waveforms are not
necessarily optimal since orthogonality can only be achieved by lowering the symbol rate, $1/T$.

There are many different bandwidth definitions [3]. In this paper, we concentrate on the
root mean square (RMS) bandwidth because it is analytically tractable and can be applied to
strictly time-limited signals. The RMS bandwidth was introduced by Gabor [4] and studied
subsequently in [3], [5] and [6]. It is the square root of the second moment of the energy
spectral density ($|S_k(f)|^2$) of the normalized signal, which is proportional to the square root of
the energy of its derivative.
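The relation between the RMS bandwidth and the energy of the derivative can be checked numerically. The sketch below uses the time-domain form $B_{rms} = \frac{1}{2\pi}\sqrt{\int |s'(t)|^2 dt \,/ \int |s(t)|^2 dt}$, which follows from Parseval's theorem; the function name `rms_bandwidth`, the finite-difference scheme and the step count are assumptions made for illustration, and the waveform is assumed to vanish at both ends of $[0,T]$ so that its derivative contains no impulses.

```python
import math

def rms_bandwidth(s, T, steps=20000):
    """Estimate the RMS bandwidth of a waveform s(t) supported on [0, T].

    Uses (1 / 2 pi) * sqrt(energy of derivative / energy of signal),
    with midpoint sums and finite differences on a uniform grid.
    """
    dt = T / steps
    energy = 0.0
    deriv_energy = 0.0
    for m in range(steps):
        t0 = m * dt
        t1 = t0 + dt
        mid = 0.5 * (t0 + t1)
        energy += s(mid) ** 2 * dt
        slope = (s(t1) - s(t0)) / dt
        deriv_energy += slope ** 2 * dt
    return math.sqrt(deriv_energy / energy) / (2.0 * math.pi)
```

For a pulse of the form $\sin(n\pi t/T)$ on $[0,T]$ this evaluates to approximately $n/(2T)$.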

In  the  two-user  case,  the  capacity  region  of  the  RMS  bandlimited  PAM  channel  has  been 
found  in  [7]  and  the  total  capacity  (the  maximum  rate  sum  over  the  capacity  region)  is  larger 
than  the  single-user  capacity  with  the  power  equal  to  the  sum  of  the  users’  powers.  The  gain 
in the total capacity from the single-user to the two-user case can be explained by the increase
in the dimensionality of the signal set. We can consider the transmitted signal in a symbol
interval as a signal drawn from a signal set. Then, the signal sets in the single-user and the
two-user cases are one-dimensional and two-dimensional, respectively. From this viewpoint, it
is  easy  to  see  that  the  total  capacity  increases  as  the  number  of  users  increases  while  the  total 
power  remains  constant. 

In this paper, we find the total capacity (TC) of the $K$-user channel under the total power
constraint

$$ \operatorname{tr}(W) \le W \qquad (3) $$

and the power-weighted average RMS bandwidth constraint

(4) 

The power constraint is placed on the total power instead of the individual powers since the latter
requires finding all possible sets of eigenvalues of a positive definite matrix with fixed diagonal
entries, which is, in general, intractable. The bandwidth constraint is justified because the
power-weighted average RMS bandwidth is the RMS bandwidth of the power spectral density
of the transmitted signal.

Several  performance  measures,  Total  Capacity  Ratio,  Power  efficiency  and  Bandwidth 
efficiency,  are  defined  and  analyzed.  Bounds  and  limiting  values  of  these  measures  are  also 
obtained. 


2.  TOTAL  CAPACITY 


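Theorem 2.1.

The Total Capacity of the $K$-user RMS bandlimited PAM Gaussian MAC with total power
and power-weighted average RMS bandwidth constraints is

$$ TC(B, K, \Lambda) = \max \left\{ \frac{B}{\gamma} \sum_{n=1}^{N} \log[1 + h_n(\lambda)] \right\} \qquad (5) $$

where the maximization is over $1 \le N \le K$, $\gamma K \Lambda / N \le \lambda$, and $1 \le \gamma \le \sqrt{f_N}$ such that
$\gamma = \sqrt{f_N}$ iff $\lambda = \gamma K \Lambda / N$, and

$$ \sum_{n=1}^{N} h_n(\lambda) = \gamma K \Lambda \qquad (6) $$

where

$$ h_n(\lambda) = \frac{N \lambda f_N + \gamma K \Lambda [\lambda(\gamma^2 - 1) - 1] - n^2 (N\lambda - \gamma K \Lambda)}{N(f_N - 1 - \lambda) + \gamma^3 K \Lambda + n^2 (N\lambda - \gamma K \Lambda)} \qquad (7) $$

$$ f_N = \frac{1}{N} \sum_{n=1}^{N} n^2 = \frac{1}{6}(N + 1)(2N + 1) \qquad (8) $$

and the average signal-to-noise ratio is denoted by $\Lambda$.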
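The quantities $f_N$ in (8) and $h_n(\lambda)$ in (7) are straightforward to evaluate. The following sketch does so; the function names `f` and `h` and the argument name `snr` (standing for $\Lambda$) are assumptions made here, and no check is made that the supplied parameters satisfy the constraints of the theorem.

```python
def f(N):
    """f_N from (8): the average of n^2 over n = 1..N."""
    return (N + 1) * (2 * N + 1) / 6.0

def h(n, lam, N, K, gamma, snr):
    """h_n(lambda) from (7); snr plays the role of the average SNR Lambda."""
    g = gamma * K * snr
    num = N * lam * f(N) + g * (lam * (gamma ** 2 - 1.0) - 1.0) - n ** 2 * (N * lam - g)
    den = N * (f(N) - 1.0 - lam) + gamma ** 3 * K * snr + n ** 2 * (N * lam - g)
    return num / den
```

Two sanity checks follow directly from the formula: $h_1(\lambda) = \lambda$ for any admissible parameters, and when $\lambda = \gamma K \Lambda / N$ every $h_n(\lambda)$ equals $\lambda$, so that (6) holds automatically.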


Proof.

Since the signature waveforms are RMS bandlimited, and the set $\{\phi_i(t,T)\}_{i=1}^{\infty}$, where

$$ \phi_i(t,T) = \begin{cases} \sqrt{2/T}\,\sin(i\pi t/T) & \text{if } t \in [0,T]; \\ 0 & \text{otherwise,} \end{cases} \qquad (10) $$

forms a complete orthonormal basis for all RMS bandlimited signals, we can write

$$ [s_1(t)\ \cdots\ s_K(t)]^T = A^T \Phi(t,T) \qquad (11) $$

where $\Phi(t,T) = [\phi_1(t,T)\ \phi_2(t,T)\ \ldots]^T$. Then, the power-weighted average RMS bandwidth
constraint can be written, via Parseval's theorem, as

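$$ \frac{\operatorname{tr}(D\, A W A^T)}{(2T)^2\, \operatorname{tr}(W)} \le B^2, \qquad D = \operatorname{diag}(1^2, 2^2, 3^2, \ldots) \qquad (12) $$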


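From the capacity region in (2), it is clear that the total capacity is maximized when
$\operatorname{tr}(W) = W$. We denote the time-bandwidth product by $\gamma = 2BT$, the average signal-to-noise
ratio by $\Lambda$, and the eigenvalues of $WH$, normalized by the noise spectral density, by $\lambda_k$,
ordered so that $\lambda_i \ge \lambda_j$ for all $i \le j \le K$.
Then, the total capacity becomes

$$ TC = \frac{B}{\gamma} \sum_{k=1}^{K} \log[1 + \lambda_k] \qquad (13) $$

and the power constraint becomes

$$ \sum_{k=1}^{K} \lambda_k = \gamma K \Lambda \qquad (14) $$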


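Since the eigenvalues of $WH$ are also the eigenvalues of $A W A^T$ (under the same noise
normalization), and since, once $\{\lambda_k\}_{k=1}^{K}$ are fixed, the left-hand side of (12) is
minimized when $A W A^T$ is diagonal with decreasing diagonal entries, we can rewrite (12) as

$$ \sum_{k=1}^{K} k^2 \lambda_k \le \gamma^3 K \Lambda \qquad (15) $$

For fixed $T$, the total capacity is found by maximizing (13) over all $\lambda_k \ge 0$, $k = 1, \ldots, K$
under the constraints (14) and (15). Using the Kuhn-Tucker Theorem, we form the Lagrangian

$$ -\sum_{k=1}^{K} \log[1 + \lambda_k] + x\Big(\sum_{k=1}^{K} \lambda_k - \gamma K \Lambda\Big) + y\Big(\sum_{k=1}^{K} k^2 \lambda_k - \gamma^3 K \Lambda\Big) \qquad (16) $$

and obtain the necessary conditions:

$$ \lambda_n = \frac{1}{x + y n^2} - 1 > 0, \qquad n = 1, \ldots, N, \qquad (17) $$

and $\lambda_n = 0$ for all $n > N$,

$$ y\Big(\sum_{n=1}^{N} n^2 \lambda_n - \gamma^3 K \Lambda\Big) = 0 \qquad (18) $$

and $0 \le y$.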

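Rewriting (17) as $(x + y n^2)(1 + \lambda_n) = 1$, and summing over all $n$, we have, from (14) and
(18),

$$ (N + \gamma K \Lambda)\,x + (N f_N + \gamma^3 K \Lambda)\,y = N \qquad (19) $$

Particularizing (17) to $n = 1$, and substituting in (19), we have

$$ y = \frac{N \lambda_1 - \gamma K \Lambda}{(1 + \lambda_1)\,(N(f_N - 1) + \gamma K \Lambda(\gamma^2 - 1))} \qquad (20) $$

and

$$ x = \frac{N(f_N - 1 - \lambda_1) + \gamma^3 K \Lambda}{(1 + \lambda_1)\,(N(f_N - 1) + \gamma K \Lambda(\gamma^2 - 1))} \qquad (21) $$

Substituting (20) and (21) into (17), and denoting $\lambda_1$ by $\lambda$ and $\lambda_n$ by $h_n(\lambda)$, we have (7), and
the power constraint in (14) becomes (6).

When $y = 0$, $\lambda = h_n(\lambda) = \gamma K \Lambda / N$ for all $n = 1, \ldots, N$. Upon substituting into (15), we
have $\sqrt{f_N} \le \gamma$. Since the total capacity becomes $\frac{BN}{\gamma} \log[1 + \gamma K \Lambda / N]$, which is monotonically
decreasing in $\gamma$, the optimal $\gamma$ is equal to $\sqrt{f_N}$ and (15) is satisfied with equality. If we rewrite
(7) and sum up over all $n$, we have

$$ (N \lambda_1 - \gamma K \Lambda)\Big(\sum_{n=1}^{N} n^2 h_n(\lambda) - \gamma^3 K \Lambda\Big) = 0 \qquad (22) $$

When $0 < y$, $\gamma K \Lambda / N < \lambda$, and from (22), (15) is again satisfied with equality. Therefore, if we
require $\gamma = \sqrt{f_N}$ iff $\lambda = \gamma K \Lambda / N$, (15) is superfluous. Finally, specifying the range of $\gamma$ and $\lambda$
and the condition that $\gamma = \sqrt{f_N}$ iff $\lambda = \gamma K \Lambda / N$, we have the desired result. ∎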
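The algebra leading from (17) and (19) to (20), (21) and (7) can be verified numerically. The sketch below computes the multipliers $x$ and $y$ from (20) and (21) and recovers the eigenvalues through (17); the function names and the argument name `snr` (for $\Lambda$) are hypothetical, and the sample parameter values used in any check are arbitrary rather than optimal.

```python
def f(N):
    """f_N from (8)."""
    return (N + 1) * (2 * N + 1) / 6.0

def multipliers(lam1, N, K, gamma, snr):
    """Return (x, y) from (21) and (20) for a given largest eigenvalue lam1."""
    g = gamma * K * snr
    common = (1.0 + lam1) * (N * (f(N) - 1.0) + g * (gamma ** 2 - 1.0))
    y = (N * lam1 - g) / common
    x = (N * (f(N) - 1.0 - lam1) + gamma ** 3 * K * snr) / common
    return x, y

def eigenvalue_from_multipliers(n, x, y):
    """Condition (17): lambda_n = 1 / (x + y n^2) - 1."""
    return 1.0 / (x + y * n ** 2) - 1.0
```

With $N = 3$, $K = 4$, $\gamma = 1.5$, $\Lambda = 2$ and $\lambda_1 = 5$, for instance, the multipliers are $x = 23/156$ and $y = 3/156$, relation (19) holds, and (17) returns $\lambda_1 = 5$ and $\lambda_2 = 121/35$, in agreement with (7).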


This theorem gives the exact calculation needed for the TC. The main reason why we cannot
obtain a simpler solution is the lack of a closed-form expression for $\sum_{n=1}^{N} \frac{1}{x + y n^2}$. However,
despite the complicated expression, the TC can be computed once the average signal-to-noise
ratio, $\Lambda$, and $K$ are given. In Figure 1, we show the TC for different values of $K$ and $\Lambda$.
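A full evaluation of the maximization in Theorem 2.1 requires solving (6) for $\lambda$ at each $N$ and $\gamma$. As a much simpler partial illustration, the sketch below evaluates only the boundary candidates $\gamma = \sqrt{f_N}$, $\lambda = \gamma K \Lambda / N$ (the $y = 0$ case of the proof), for which the objective reduces to $\frac{BN}{\gamma}\log[1 + \gamma K \Lambda / N]$. The result therefore cannot exceed the maximum in (5) and is not the TC itself. The function names are hypothetical, and the natural logarithm is an assumption about the base of the logarithm.

```python
import math

def f(N):
    """f_N from (8)."""
    return (N + 1) * (2 * N + 1) / 6.0

def uniform_branch_value(B, K, snr, N):
    """Objective of (5) at gamma = sqrt(f_N), lambda = gamma K snr / N (all h_n equal)."""
    gamma = math.sqrt(f(N))
    return (B * N / gamma) * math.log(1.0 + gamma * K * snr / N)

def tc_uniform_branch(B, K, snr):
    """Best boundary candidate over N = 1..K; a lower estimate of the maximum in (5)."""
    return max(uniform_branch_value(B, K, snr, N) for N in range(1, K + 1))
```

For $K = 1$ the only candidate is $N = 1$, $\gamma = 1$, and the value reduces to $B \log[1 + \Lambda]$.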

For a given $W$, we show that any set of signature waveforms, with $A$ such that $A W A^T$
is a diagonal matrix with the $n$th diagonal entry equal to $h_n(\lambda)$, is optimal. However, such an
$A$ does not always exist for an arbitrarily given $W$. For fixed total power, $W$, finding the set
of  W  where  A  exists  is  equivalent  to  finding  the  possible  set  of  diagonal  entries  of  a  positive 
definite  matrix  with  fixed  eigenvalues,  which  seems  intractable.  Reversing  the  problem,  one 
may  want  to  fix  the  W  and  find  the  total  capacity.  In  general,  this  is  equivalent  to  finding 
the  possible  set  of  eigenvalues  of  a  positive  definite  matrix  with  fixed  diagonal  entries,  which 
is  again  intractable. 

In the following theorem, we give a lower bound to the TC in the equal-power case where
$W = \frac{W}{K} I$ (the matrix of users' powers on the left, the total power on the right). Clearly,
this is also a lower bound to the capacity of the original channel with the
total power constraint in Theorem 2.1.

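Theorem 2.2.

The lower bound to the Total Capacity when the users' powers are the same is

$$ TC_{ep}(B, K, \Lambda) \ge \max_{1 \le \gamma \le \sqrt{f_K}} \frac{B}{\gamma} \log\left\{ \left[1 + \gamma\Lambda\, \frac{(\gamma^2 - 1) + K(f_K - \gamma^2)}{f_K - 1}\right] \left[1 + \gamma\Lambda\, \frac{\gamma^2 - 1}{f_K - 1}\right]^{K-1} \right\} \qquad (23) $$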


Proof.

The lower bound is found by exhibiting a symmetric positive definite matrix $H$, such that
the total capacity for that particular signature waveform set is easy to find. We let $H$ be

$$ H = \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \rho \\ \rho & \cdots & \rho & 1 \end{bmatrix} \qquad (24) $$

The eigenvalues of $H$, with $0 \le \rho \le 1$ to be specified in the sequel, are $1 + (K - 1)\rho$ and $1 - \rho$
with multiplicity $K - 1$. Then, the total capacity under the equal-power constraint becomes

$$ TC = \frac{B}{\gamma} \log[\det(I_K + \gamma\Lambda H)] = \frac{B}{\gamma} \log\left\{ [1 + \gamma\Lambda(1 + (K - 1)\rho)]\,[1 + \gamma\Lambda(1 - \rho)]^{K-1} \right\} \qquad (25) $$

while the bandwidth constraint (15) becomes

$$ \rho \ge \frac{f_K - \gamma^2}{f_K - 1} \qquad (26) $$

Since (25) is monotonically decreasing in $\rho$ when $0 \le \rho \le 1$, the TC is maximized when $\rho$
achieves equality in (26). Substituting $\rho$ from (26) with equality into (25), and maximizing
over all $\gamma$, we have (23). ∎
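The bound of Theorem 2.2 can be evaluated with a one-dimensional search over $\gamma$. The sketch below substitutes the equality case of (26) into (25) and scans a uniform grid on $[1, \sqrt{f_K}]$; the grid search, the grid size, the function names, and the use of the natural logarithm are all assumptions made for illustration, and $K \ge 2$ is required so that $f_K > 1$.

```python
import math

def f(K):
    """f_K from (8)."""
    return (K + 1) * (2 * K + 1) / 6.0

def rho_from_bandwidth(K, gamma):
    """Smallest correlation allowed by (26), i.e. (26) with equality."""
    return (f(K) - gamma ** 2) / (f(K) - 1.0)

def tc_equicorrelated(B, K, snr, gamma, rho):
    """Total capacity (25) for the equicorrelated matrix H of (24)."""
    big = 1.0 + gamma * snr * (1.0 + (K - 1) * rho)
    small = 1.0 + gamma * snr * (1.0 - rho)
    return (B / gamma) * (math.log(big) + (K - 1) * math.log(small))

def tc_ep_lower_bound(B, K, snr, grid=2000):
    """Grid-search approximation of the right-hand side of (23); returns (value, gamma)."""
    if K < 2:
        raise ValueError("K must be at least 2 so that f_K > 1")
    g_max = math.sqrt(f(K))
    best_val, best_gamma = float("-inf"), 1.0
    for m in range(grid + 1):
        gamma = 1.0 + (g_max - 1.0) * m / grid
        val = tc_equicorrelated(B, K, snr, gamma, rho_from_bandwidth(K, gamma))
        if val > best_val:
            best_val, best_gamma = val, gamma
    return best_val, best_gamma
```

Since the grid includes the endpoint $\gamma = \sqrt{f_K}$ (where $\rho = 0$), the returned value is never below $\frac{BK}{\sqrt{f_K}} \log[1 + \sqrt{f_K}\,\Lambda]$.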


In Figure 1, we plot the lower bound to the $TC_{ep}$ for different values of $K$ and $\Lambda$. Since
the TC under the total power constraint serves as an upper bound to the $TC_{ep}$, Figure 1 gives
tight upper and lower bounds to the Total Capacity of an equal-power constrained channel, for
a moderate number of users.

As a performance measure, we define the Total Capacity Ratio (TCR) as the ratio of the
$K$-user TC to $K$ times the single-user capacity with the same RMS bandwidth and average
signal-to-noise ratio constraints. Since the single-user capacity of an RMS bandlimited PAM
channel is equal to $B \log[1 + \Lambda]$ (see [7]), the TCR can be written as

$$ TCR(K, \Lambda) = \frac{TC(B, K, \Lambda)}{B K \log[1 + \Lambda]} \qquad (27) $$


The  TCR  gives  the  ratio  of  the  capacity  available  to  an  average  user  (when  the  channel  is 
shared  by  K  users)  to  the  single-user  capacity.  In  other  words,  it  measures,  from  the  user’s 
viewpoint,  the  ratio  of  the  average  user  capacity  in  a  multi-user  channel  to  the  capacity  in  a 
single-user channel. Notice that the TCR depends only on $K$ and $\Lambda$, and is independent of
B.  Using  the  lower  bound  in  Theorem  2.2,  we  obtain  a  lower  bound  to  the  TCR  under  the 
equal-power  constraint  for  all  signal-to-noise  ratios. 


Corollary 2.1.

A lower bound to the TCR under the equal-power constraint for all signal-to-noise ratios is

$$ TCR(K, \Lambda) \ge \frac{1}{\gamma} \qquad (28) $$

where $\gamma$ is the positive real root of the equation

$$ \gamma(\gamma^2 - 1) = f_K - 1 \qquad (29) $$

Proof.

In order to obtain (28), we simply substitute $\gamma$ from (29) into (23) and (27). Since there
is one and only one real positive solution of (29), there is no ambiguity in the value of $\gamma$. ∎
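The root of (29) has no convenient closed form, but it is easy to find numerically because $\gamma(\gamma^2 - 1)$ is increasing for $\gamma \ge 1$ and the root lies in $[1, \sqrt{f_K}]$. The sketch below uses bisection; the function names and the iteration count are assumptions made for illustration.

```python
import math

def f(K):
    """f_K from (8)."""
    return (K + 1) * (2 * K + 1) / 6.0

def gamma_root(K, iters=200):
    """Positive real root of gamma (gamma^2 - 1) = f_K - 1, found by bisection."""
    target = f(K) - 1.0
    lo, hi = 1.0, math.sqrt(f(K))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid * (mid * mid - 1.0) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tcr_lower_bound(K):
    """Right-hand side of (28)."""
    return 1.0 / gamma_root(K)
```

For $K = 1$ the root is $\gamma = 1$ and the bound equals 1, as expected for a single user; for larger $K$ the bound decreases.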

In Figure 2, we show the TCR under the total power constraint and the lower bound to
the TCR under the equal-power constraint for different numbers of users and different average
signal-to-noise ratios.


3.  EFFICIENCIES 

The TCR gives the performance degradation, from the user's viewpoint, when a bandlimited
channel is shared by $K$ users instead of a single user. A natural question to be asked is
“How  to  maintain  the  same  rate  in  the  presence  of  other  users?”  If  we  want  to  maintain  the 
same  information  rate,  we  have  to  modify  some  of  the  parameters.  In  the  following,  we  will 
analyze  two  alternatives.  First,  we  increase  the  signal-to-noise  ratio  by  increasing  the  power 
while  the  bandwidth  remains  constant.  Second,  we  increase  the  bandwidth  of  the  channel 
while  the  power  of  each  user  remains  the  same. 


The power efficiency, denoted by $\eta_P(K,\Lambda)$, is defined as

$$ \eta_P(K,\Lambda) = \frac{\exp\{TC(B,K,\Lambda)/(BK)\} - 1}{\Lambda} \qquad (30) $$

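or, equivalently, implicitly as

$$ TC(B, K, \Lambda) = B K \log[1 + \eta_P(K,\Lambda)\,\Lambda] \qquad (31) $$

The bandwidth efficiency, denoted by $\eta_B(K,\Lambda)$, is defined implicitly as

$$ TC(B, K, \Lambda) = \eta_B(K,\Lambda)\, B K \log\left[1 + \frac{\Lambda}{\eta_B(K,\Lambda)}\right] \qquad (32) $$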

The power efficiency, $\eta_P(K,\Lambda)$ (bandwidth efficiency, $\eta_B(K,\Lambda)$), gives the ratio of the
effective power (bandwidth) to the actual power (bandwidth) when the actual signal-to-noise
ratio is $\Lambda$. The actual power (bandwidth) is the power (bandwidth) used in transmission while
the effective power (bandwidth) is the corresponding power (bandwidth) needed for a single-user
channel to achieve the same capacity. In other words, $-10\log_{10}[\eta_P(K,\Lambda)]$ gives the power
in dB that we have to add to each user in order to maintain the single-user capacity. Similarly,
$1/\eta_B(K,\Lambda)$ gives the factor by which we have to increase the bandwidth in order to maintain the
same information rate.
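Given a value of the total capacity, both efficiencies can be recovered from their implicit definitions (31) and (32). The sketch below inverts (31) in closed form and solves (32) by bisection, using the fact that $\eta \log[1 + \Lambda/\eta]$ is increasing in $\eta$. The function names, the bisection bracket, and the use of the natural logarithm are assumptions made for illustration.

```python
import math

def power_efficiency(tc, B, K, snr):
    """Solve (31), tc = B K log(1 + eta_P snr), for eta_P (natural log assumed)."""
    return (math.exp(tc / (B * K)) - 1.0) / snr

def bandwidth_efficiency(tc, B, K, snr, iters=200):
    """Solve (32), tc = eta_B B K log(1 + snr / eta_B), for eta_B by bisection."""
    target = tc / (B * K)
    if not 0.0 < target < snr:
        # eta * log(1 + snr / eta) increases from 0 towards snr, so no root exists otherwise
        raise ValueError("tc / (B K) must lie strictly between 0 and snr")

    def g(eta):
        return eta * math.log(1.0 + snr / eta)

    lo, hi = 1e-12, 1.0
    while g(hi) < target:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def extra_power_db(eta_p):
    """Power in dB to add per user to restore the single-user capacity."""
    return -10.0 * math.log10(eta_p)
```

When the supplied total capacity equals $BK \log[1 + \Lambda]$, both functions return 1, which corresponds to no loss relative to $K$ separate single-user channels.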

Theorem 3.1.

The power efficiency satisfies

$$ \lim_{\Lambda \to \infty} \eta_P(K,\Lambda) = 0 \qquad (33) $$

A lower bound to the bandwidth efficiency, $\eta_B(K,\Lambda)$, for all signal-to-noise ratios under the
equal-power constraint is

$$ \eta_B(K,\Lambda) \ge \frac{1}{\sqrt{f_K}} \qquad (34) $$

where $f_K$ is defined in (8).

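Proof.

From the definition of $\eta_P(K,\Lambda)$, we have

$$ \prod_{n=1}^{N} (1 + \lambda_n(\Lambda)) = [1 + \eta_P(K,\Lambda)\Lambda]^{K\gamma} \qquad (35) $$

where $N$, $\lambda_n(\Lambda)$ and $\gamma$ are all optimally selected for that $\Lambda$. Subtracting (14) from (15), it is
easy to get

$$ \lambda_n(\Lambda) \le \gamma K \Lambda(\gamma^2 - 1), \qquad n = 2, \ldots, N. \qquad (36) $$

Substituting (36) into (35), and dividing both sides by $\Lambda^{K\gamma}$, we have, in the limit as $\Lambda \to \infty$,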


If $\gamma \to 1$ as $\Lambda \to \infty$, the second factor on the left-hand side of (37) tends to 0, while the first
factor tends either to 0 ($N < K$) or to 1 ($N = K$). On the other hand, if $\gamma \to a > 1$ as
$\Lambda \to \infty$, the first factor on the left-hand side of (37) tends to 0 for any $N \le K$, while the
second factor is bounded. Therefore, in both cases, the left-hand side tends to 0 and since
$1 \le \gamma \le \sqrt{f_K}$, we have $\eta_P(K,\Lambda) \to 0$ as $\Lambda \to \infty$.

Substituting $\gamma = \sqrt{f_K}$ in Theorem 2.2, we have

$$ TC_{ep}(B, K, \Lambda) \ge \frac{1}{\sqrt{f_K}}\, B K \log[1 + \sqrt{f_K}\,\Lambda] \qquad (38) $$

Since the right-hand side of (32) is monotonically increasing in $\eta_B(K,\Lambda)$, we have (34) when
compared to (38). ∎


The TC is obtained by optimizing the balance between the “symbol rate” factor, $B/\gamma$, and
the “information sent per symbol” factor, $\log(\cdots)$. As the average signal-to-noise ratio tends
to  infinity,  the  “symbol  rate”  factor  dominates  and  the  optimal  users’  signature  waveforms  are 
asymptotically  identical.  Then,  the  product  term  of  the  signal-to-noise  ratios  inside  the  log 
function  in  the  TC  becomes  relatively  small,  and  the  asymptotic  power  efficiency  is  equal  to 
zero.  The  bandwidth  efficiency  indicates  the  increase  in  bandwidth  needed  to  maintain  the 
same  user  rate  when  a  single-user  channel  is  shared  by  K  users. 
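Since $1/\eta_B$ is the bandwidth expansion factor, the bound (34) guarantees that under equal powers an expansion of at most $\sqrt{f_K}$ suffices. The short sketch below tabulates this guaranteed factor next to $K$, the expansion that would result from giving every user a full single-user band; the function names are hypothetical and the comparison with a $K$-fold expansion is an illustrative assumption.

```python
import math

def f(K):
    """f_K from (8)."""
    return (K + 1) * (2 * K + 1) / 6.0

def guaranteed_expansion(K):
    """Upper bound sqrt(f_K) on the bandwidth expansion 1 / eta_B implied by (34)."""
    return math.sqrt(f(K))

def expansion_table(user_counts):
    """List of (K, sqrt(f_K), K) triples for a quick side-by-side comparison."""
    return [(K, guaranteed_expansion(K), float(K)) for K in user_counts]
```

For large $K$ the guaranteed factor grows like $K/\sqrt{3}$, so this bound alone already stays below a $K$-fold expansion for every $K \ge 2$.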

In Figures 3 and 4, we plot the Power and Bandwidth efficiency for different values of $K$
and $\Lambda$. Also, in the same graphs, we show the lower bound to the efficiencies for the
equal-power constrained channel. They show that, regardless of the signal-to-noise ratio, by increasing the
bandwidth by a factor of 10 we can accommodate about 50 users on the multi-user channel.
This indicates an 80 percent reduction in the bandwidth required by Frequency Division Multiple
Access (FDMA). Clearly, the tradeoff is a more complicated demodulating and decoding
process in the Synchronous Code Division Multiple Access (CDMA) channel, which is a special
case of the current model.

Figure 1. Total Capacity for different number of users
and average signal-to-noise ratios when B = 1 kHz


Figure 2. Total Capacity Ratio for different number
of users and average signal-to-noise ratios

Figure 3. Bandwidth efficiency for different number
of users and average signal-to-noise ratios

Figure 4. Power efficiency for different number
of users and average signal-to-noise ratios


